New feature: Read input notebook from github #556
Comments
I'd like to have it.
This is a good pattern to use for reading from git as a read-only source. If someone wanted to invest a little time in making a new IO handler for reading from git, GitPython would be a useful library for it. I'd be happy to review / merge such an improvement.
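A rough sketch of what such a read-only handler might look like, in case it helps whoever picks this up. The `git://<repo_path>@<rev>:<file_path>` path form, `GitNotebookRef`, `parse_git_path`, and `GitReadHandler` are all hypothetical names made up for illustration, and the method set (`read`, `listdir`, `write`, `pretty_path`) is an assumption about what a papermill IO handler is expected to provide. It reads from a local clone only.

```python
from typing import NamedTuple


class GitNotebookRef(NamedTuple):
    repo_path: str  # path to a local clone on disk
    rev: str        # branch, tag, or commit SHA
    file_path: str  # path of the notebook inside the repository


def parse_git_path(path: str) -> GitNotebookRef:
    """Parse the hypothetical form 'git://<repo_path>@<rev>:<file_path>'."""
    prefix = "git://"
    if not path.startswith(prefix):
        raise ValueError(f"not a git:// path: {path!r}")
    # Split from the right so the repo path itself may contain ':' or '@'.
    location, colon, file_path = path[len(prefix):].rpartition(":")
    repo_path, at, rev = location.rpartition("@")
    if not (colon and at and repo_path and rev and file_path):
        raise ValueError(f"expected git://<repo>@<rev>:<file>, got {path!r}")
    return GitNotebookRef(repo_path, rev, file_path)


class GitReadHandler:
    """Read-only handler sketch: writing and listing are not supported."""

    def read(self, path: str) -> str:
        # GitPython is imported lazily so path parsing works without it.
        import git

        ref = parse_git_path(path)
        repo = git.Repo(ref.repo_path)
        # Look the file up in the tree of the requested revision.
        blob = repo.commit(ref.rev).tree / ref.file_path
        return blob.data_stream.read().decode("utf-8")

    def listdir(self, path: str):
        raise NotImplementedError("listdir is not supported for git sources")

    def write(self, buf: str, path: str):
        raise NotImplementedError("git sources are read-only")

    def pretty_path(self, path: str) -> str:
        return path
```

If papermill's handler registry works the way I assume, something like `papermill_io.register("git://", GitReadHandler())` (from `papermill.iorw`) would hook it in. Treat that call as unverified for your version.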
If you're using the CLI, the terminal outputs progress (there are a few options to control this). Additionally, it saves the notebook output after each cell and periodically within a cell, so refreshing the destination location in a notebook browser will show progress as well, albeit not necessarily in real time.
@MSeal I use the instance's startup script, which uses papermill to execute the notebook. I run 'gcloud reset' on the machine so it would be started and the startup script will run. Is there a different way I can remotely run the notebook and also see its progress as you said?
@MSeal
@onevirus That sounds reasonable for what you're targeting. I can imagine a more general git solution as well, since there are a lot of git repos that aren't GitHub/GitLab. That being said, GitHub is the most popular in open source, so I think optimizing for that end is worth the effort.
@ronytesler this is somewhat of a different topic than the issue that was opened here, but usually you have the startup script logging to a logging sink that captures the stdout/stderr and makes it available to view. Papermill in and of itself doesn't manage this, as it's a bit out of scope for the project. Managed execution of VMs or containers isn't the easiest to navigate, but most of the solutions involve monitoring those standard outputs and triggering said executors on demand in some execution context. In this story arc, papermill's responsibility is to output log text, save the notebook, and manage the kernel locally within that context.
In my org, I store every notebook in GitHub and have tweaked papermill to read notebooks from GitHub directly, because we use papermill heavily in production and need to version our notebooks.
This is how we use it:
It takes just the URL of the notebook, like binder and nbviewer.
This has some pros:
Some teams use only dev / master branches (reading from the dev branch in the dev environment and from the master branch in the production environment).
Other teams use tags to version notebooks.
We don't need storage for notebooks (we put output notebooks in GCS).
What do you think?
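To make the URL idea concrete, here is a minimal sketch of one way the mapping could work. It is not the tweak described above. It assumes a public repository, a branch or tag name without slashes, and the usual `https://github.com/<org>/<repo>/blob/<ref>/<path>` page URL. The function name `github_blob_to_raw` and the `my-org/notebooks` repository are made up for illustration.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url: str) -> str:
    """Turn a GitHub file-page URL into the matching raw-content URL."""
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {url!r}")
    # Expected path segments: <org>/<repo>/blob/<ref>/<path inside repo...>
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub file URL: {url!r}")
    org, repo, _, ref, *file_parts = segments
    file_path = "/".join(file_parts)
    return f"https://raw.githubusercontent.com/{org}/{repo}/{ref}/{file_path}"


# The same notebook read from two branches, one per environment
# (hypothetical repository and notebook names).
dev_url = github_blob_to_raw(
    "https://github.com/my-org/notebooks/blob/dev/reports/daily.ipynb"
)
prd_url = github_blob_to_raw(
    "https://github.com/my-org/notebooks/blob/master/reports/daily.ipynb"
)
```

The resulting URL could then be passed as the input path, assuming the papermill version in use can read `https://` inputs. Swapping `dev` or `master` for a tag name covers the tag-based versioning case. Private repositories would need authentication, which this sketch does not handle.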