Feature Request: Cache Values #16
Comments
Thanks @AnotherCodeArtist, happy to hear that 😃! Yes, having something that carries values over to another cell is high on my want list -- it's the item "Library to easily store/retrieve calculated content." in the TODO list. The thing is, GoNB works by recompiling and re-executing at every cell execution, so the value cannot simply stay alive. One workaround that comes to mind is a library that quickly serializes/deserializes on demand. So your example would look like:
var lotsOfData = CacheValue(func () Data { return LoadOverTheInternet("https://.....") }, "lotsOfData")
Where:
func CacheValue[T any](fn func() T, key string) T {
...
}
Would try to first read the value
So it actually runs the
For really large blobs of data this may still not be good enough, but then maybe one could memory-map a file with the data in binary format (e.g., a large array of floats). It would require special memory management, but it's easy to make this manageable. Any thoughts? I suppose that's what you meant by caching the data? Btw, on the
cheers
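A minimal sketch of the serialize/deserialize-on-demand idea described above, assuming `encoding/gob` files in a temporary directory. Only the `CacheValue` signature comes from the comment; `cacheDir`, `keyPath` and `ResetKey` are hypothetical names introduced for illustration, and this is not GoNB's actual implementation.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
	"path/filepath"
)

// cacheDir is a hypothetical location for cached values (an assumption,
// not something GoNB defines).
var cacheDir = filepath.Join(os.TempDir(), "cachevalue_sketch")

// keyPath maps a key to the file holding its serialized value.
func keyPath(key string) string {
	return filepath.Join(cacheDir, key+".gob")
}

// ResetKey discards the cached value for key, if there is one.
func ResetKey(key string) {
	_ = os.Remove(keyPath(key))
}

// CacheValue returns the value stored under key if it can be read back.
// Otherwise it calls fn, stores the result on a best-effort basis, and
// returns it.
func CacheValue[T any](fn func() T, key string) T {
	var value T
	if f, err := os.Open(keyPath(key)); err == nil {
		err = gob.NewDecoder(f).Decode(&value)
		f.Close()
		if err == nil {
			return value // cache hit: fn is never called
		}
	}

	// Cache miss (or unreadable file): compute the value.
	value = fn()

	// Failing to write the cache is not fatal; the value is still returned.
	if err := os.MkdirAll(cacheDir, 0o755); err != nil {
		return value
	}
	f, err := os.Create(keyPath(key))
	if err != nil {
		return value
	}
	encodeErr := gob.NewEncoder(f).Encode(value)
	closeErr := f.Close()
	if encodeErr != nil || closeErr != nil {
		// Do not leave a partially written file behind.
		_ = os.Remove(keyPath(key))
	}
	return value
}

func main() {
	ResetKey("lotsOfData") // start from a clean state for the demo

	calls := 0
	expensive := func() int {
		calls++
		return 42
	}

	first := CacheValue(expensive, "lotsOfData")
	second := CacheValue(expensive, "lotsOfData")
	fmt.Println("first:", first, "second:", second, "calls:", calls)
}
```

Because the value lives in a file, it survives the recompile-and-re-execute cycle of each cell. `gob` only handles types it can encode (for example, structs need exported fields), so a different encoding may suit other data.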
Hi Jan! Sounds like a good first shot. Would be even cooler if there were a chance to provide some cell magic like
Anyhow, here's a working
ARG BASE_IMAGE=jupyter/base-notebook
ARG BASE_TAG=python-3.10
FROM ${BASE_IMAGE}:${BASE_TAG}
USER $NB_USER
ENV GOVERSION=1.20
USER root
WORKDIR /root
RUN wget https://dl.google.com/go/go$GOVERSION.linux-amd64.tar.gz && \
tar -C /usr/local -xzf go$GOVERSION.linux-amd64.tar.gz
RUN apt-get update && apt-get install -y git libtool pkg-config build-essential autoconf automake uuid-dev libzmq3-dev
USER $NB_USER
WORKDIR /home/jovyan
ENV GOROOT=/usr/local/go
ENV GOPATH=/home/jovyan/go
ENV PATH=$PATH:$GOROOT/bin:$GOPATH/bin
# Install GoNB (https://github.com/janpfeifer/gonb)
RUN go install github.com/janpfeifer/gonb@latest && \
go install golang.org/x/tools/cmd/goimports@latest && \
go install golang.org/x/tools/gopls@latest && \
gonb --install
WORKDIR /home/jovyan/work
USER root
Build it with
docker build -t gonb:latest .
Run it with
docker run -p 8888:8888 --rm gonb:latest
On the cache: I'm hesitant to create the cell magic -- I'm a big fan of making things explicit, even if it requires a bit more typing. Also, the cache system is useful outside GoNB. So if normal Go code can achieve the same thing, I think that is a plus (one less thing for the end-user to learn). Thanks for the Dockerfile! I'll add it this weekend and publish an image on Docker Hub so folks can simply pull from it.
Thx again @AnotherCodeArtist. I added a few more things to your initial
Let me see if I can cook a Cache Values library next.
I took an initial stab at it, check it out in c9a1f3198096180f63042cd667675ddee8c7f2bc. I haven't created a new release yet. I'll give it a few days; if you see any issues, or it doesn't work for you, let me know. If everything works I'll create a section in the tutorial about it and the 0.6 release.
Hi Jan! Just tried to use the new
My dependencies in the docker image are
RUN go install github.com/janpfeifer/gonb@c9a1f3198096180f63042cd667675ddee8c7f2bc && \
go install golang.org/x/tools/cmd/goimports@latest && \
go install golang.org/x/tools/gopls@latest && \
gonb --install
Nevertheless, it seems that 0.5.1 gets downloaded when running
var phonebook = cache.Cache("my_data", func() *Data { return LoadPhoneBookFile() })
Do I require any additional dependencies/imports?
Sorry, I probably should have explained. But the
Try this, in 3 different cells:
!*go get -u github.com/janpfeifer/gonb/cache@c9a1f3198096180f63042cd667675ddee8c7f2bc
import (
"math/rand"
"github.com/janpfeifer/gonb/cache"
)
var (
a = rand.Intn(100)
b = cache.Cache("b", func() int { return rand.Intn(100) })
)
%%
fmt.Printf("a=%d, b=%d\n", a, b)
Hi Jan! Works great for my scenario! Just a thought: would it be possible to use some in-memory database (which would need to be started and controlled by the kernel) for caching, or would it have no influence on execution time anyway?
Nice, I'm happy it worked. So about having an in-memory database of sorts to store the cache: the OS is pretty good at caching disk contents, so in most cases (if we are not talking GBs of data), when working interactively on the notebook, everything will be in memory anyway. Another inefficiency of the OS filesystem may be the number of files, if you start having thousands or millions of cached values. In those cases one reasonable option would be packing collections of values into a container, and caching the container instead? Also, notice that the
If none of those work for you, let me know what your scenario is. The API is flexible enough to support different types of backends -- one could create a
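One way to read the "pack values into a container" suggestion is sketched below: many small values are stored in a single map and written as one file, instead of one file per value. The `Container` type and the `SaveContainer`, `LoadContainer` and `ContainerPath` helpers are hypothetical names for illustration only and are not part of the GoNB cache API.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
	"path/filepath"
)

// Container packs many small values into a single cache entry.
// The name and layout are illustrative assumptions.
type Container map[string][]float64

// ContainerPath returns a hypothetical file location for a named container.
func ContainerPath(name string) string {
	return filepath.Join(os.TempDir(), name+".gob")
}

// SaveContainer writes the whole container to one file.
func SaveContainer(path string, c Container) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if err := gob.NewEncoder(f).Encode(c); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// LoadContainer reads a container previously written by SaveContainer.
func LoadContainer(path string) (Container, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var c Container
	if err := gob.NewDecoder(f).Decode(&c); err != nil {
		return nil, err
	}
	return c, nil
}

func main() {
	path := ContainerPath("container_sketch")

	// A thousand small values end up in one file rather than a thousand files.
	c := Container{}
	for i := 0; i < 1000; i++ {
		c[fmt.Sprintf("item-%d", i)] = []float64{float64(i), float64(i) * 0.5}
	}
	if err := SaveContainer(path, c); err != nil {
		fmt.Println("save failed:", err)
		return
	}

	loaded, err := LoadContainer(path)
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Println("entries:", len(loaded), "item-10:", loaded["item-10"])
}
```

The trade-off is granularity: updating a single entry means rewriting the whole container, so this suits values that are computed and invalidated together.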
Thanks for the hint with the ramdisk. As I've already pointed out, if there's no significant performance gain it's not worth going through the hassle!
Nice. Closing the issue then. Next weekend I'll create a new release, and update the tutorial.
Hi Jan!
Thanks for the great work. It already became my favorite Go kernel and I'm using it on a JupyterHub cluster.
One thing, however, would be great: declared functions, types and variables can be re-used across multiple cells. But if a variable holds the result of a function call
and this variable is used later in another cell
then the initial result loaded in the previous cell is not reused; instead, the function call is executed again.
Is there any chance to cache the data instead of executing the function over and over again?
BTW: If you need a Dockerfile, I already have one (although a bit specific, since it is running in our cluster with some custom modifications).