CLI tool for remote notebook execution on BERDL JupyterHub.
Execute notebooks and shell commands on your BERDL Hub environment from your local machine.
```bash
# Install directly from GitHub
pip install "git+https://github.com/BERDataLakehouse/berdl_remote.git"

# For development (from source)
git clone https://github.com/BERDataLakehouse/berdl_remote.git
cd berdl_remote
pip install -e .

# With development dependencies
pip install -e ".[dev]"
```

To pull berdl-remote into another project as a dependency:

```toml
# Add to pyproject.toml dependencies
berdl-remote = { git = "https://github.com/BERDataLakehouse/berdl_remote.git", rev = "main" }
```

The easiest way to configure berdl-remote is using the BERDL Access Request Extension in JupyterLab:
- Open JupyterLab on hub.berdl.kbase.us.
- Click the "Get Credentials" button (key icon) in the toolbar.
- Click Download Config File (saves as `remote-config.yaml`).
- Move the file to `~/.berdl/remote-config.yaml`.
Alternatively, you can copy the configuration text from the modal and paste it into `~/.berdl/remote-config.yaml`.
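If you take the copy-paste route, a minimal sketch of the one-time setup (the `0600` permission matches the security notes at the end of this document):

```bash
# Create the config directory for the manual copy-paste route
mkdir -p ~/.berdl
# Paste the copied configuration text into the file with any editor
"${EDITOR:-nano}" ~/.berdl/remote-config.yaml
# Keep the credentials readable by you alone
chmod 600 ~/.berdl/remote-config.yaml
```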
Manual Method (DevTools)
- Log in to Hub.
- Open Browser DevTools (F12 → Application → Cookies).
- Copy `_xsrf`, `jupyterhub-session-id`, and `jupyterhub-user-USERNAME`.
- Run `berdl-remote configure`.
```bash
berdl-remote configure
```

Follow the prompts to enter your Hub URL, username, and cookies.
Configuration is saved to `~/.berdl/remote-config.yaml`.
```bash
berdl-remote status
```

Important: Your Jupyter server must be running with at least one notebook open (for an active kernel).
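A quick way to confirm the kernel is actually responsive is to evaluate a trivial expression through the `python` subcommand documented below; this is an optional sanity check, not required setup:

```bash
# If this prints, the remote kernel is up and reachable
berdl-remote python "print('kernel alive')"
```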
Check connection to your Jupyter server.
```bash
berdl-remote status
```

Execute a notebook with optional parameters. Creates a separate output file.
```bash
# Basic execution (local file in your home directory)
berdl-remote run /home/myuser/notebooks/analysis.ipynb

# With parameters
berdl-remote run /home/myuser/notebooks/analysis.ipynb \
  -p batch_size 100 \
  --output /home/myuser/notebooks/analysis_executed.ipynb

# Using S3/MinIO paths directly
berdl-remote run s3://cdm-lake/users-general-warehouse/myuser/notebooks/analysis.ipynb \
  --output s3://cdm-lake/users-general-warehouse/myuser/notebooks/analysis_executed.ipynb
```
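The `-p` flag can be repeated to pass several parameters in one run. The `run` command mirrors the manual papermill invocation shown in the shell section below, so each key/value pair overrides a variable declared in the notebook's cell tagged `parameters`, per papermill's convention. A sketch combining the parameters from the examples above:

```bash
# Repeated -p flags: each pair becomes a variable in the executed notebook
berdl-remote run /home/myuser/notebooks/analysis.ipynb \
  -p batch_size 100 \
  -p date 2024-01-15 \
  --output /home/myuser/notebooks/analysis_executed.ipynb
```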
Execute a notebook in place using nbconvert.

```bash
berdl-remote nbconvert /home/myuser/notebooks/quick_test.ipynb --inplace
```

Execute Python code directly on the remote kernel. Has access to all kernel variables (`spark`, `get_settings`, etc.).
```bash
# Print settings
berdl-remote python "print(get_settings().USER)"

# Run Spark queries
berdl-remote python "spark = get_spark_session(); spark.sql('SHOW DATABASES').show()"

# Check environment
berdl-remote python "import os; print(os.environ.get('MINIO_ENDPOINT_URL'))"
```

Execute shell commands on the remote server.
```bash
# List files
berdl-remote shell "ls -la /minio/my-files/notebooks/"

# Check papermill version
berdl-remote shell "papermill --version"

# Execute notebook with papermill manually (via shell)
berdl-remote shell "papermill s3://cdm-lake/users-general-warehouse/myuser/notebooks/analysis.ipynb s3://cdm-lake/users-general-warehouse/myuser/notebooks/analysis_executed.ipynb"
```

For detailed instructions on accessing MinIO, configuring the MinIO client (mc), and using Python/boto3 with BERDL MinIO, please refer to the BERDL MinIO Guide.
Files are stored in the cdm-lake bucket with this structure:
| Type | S3 Path |
|---|---|
| Personal files | s3://cdm-lake/users-general-warehouse/YOUR_USERNAME/ |
| Personal SQL warehouse | s3://cdm-lake/users-sql-warehouse/YOUR_USERNAME/ |
| Tenant files | s3://cdm-lake/tenant-general-warehouse/TENANT_NAME/ |
| Tenant SQL warehouse | s3://cdm-lake/tenant-sql-warehouse/TENANT_NAME/ |
Example for user `myuser`:

- Notebook location: `s3://cdm-lake/users-general-warehouse/myuser/notebooks/test.ipynb`
```bash
# Execute notebook from S3/MinIO
berdl-remote run s3://cdm-lake/users-general-warehouse/myuser/notebooks/test.ipynb \
  --output s3://cdm-lake/users-general-warehouse/myuser/notebooks/test_executed.ipynb

# Using shell (direct papermill control)
berdl-remote shell "papermill s3://cdm-lake/users-general-warehouse/myuser/notebooks/test.ipynb s3://cdm-lake/users-general-warehouse/myuser/notebooks/test_executed.ipynb"

# With parameters
berdl-remote run s3://cdm-lake/users-general-warehouse/myuser/notebooks/analysis.ipynb \
  --output s3://cdm-lake/users-general-warehouse/myuser/notebooks/analysis_output.ipynb \
  -p date 2024-01-15
```

A complete upload-execute-download workflow:

```bash
#!/bin/bash
# run_analysis.sh

# 1. Upload notebook to MinIO
mc cp analysis.ipynb berdl-minio/cdm-lake/users-general-warehouse/myuser/notebooks/
# 2. Execute remotely with parameters
berdl-remote run s3://cdm-lake/users-general-warehouse/myuser/notebooks/analysis.ipynb \
  -p date "2024-01-15" \
  -p data_source "s3://bucket/data/" \
  --output s3://cdm-lake/users-general-warehouse/myuser/notebooks/analysis_2024-01-15.ipynb
# 3. Download results
mc cp berdl-minio/cdm-lake/users-general-warehouse/myuser/notebooks/analysis_2024-01-15.ipynb ./results/
```

The saved config file (`~/.berdl/remote-config.yaml`) looks like this:

```yaml
hub_url: https://hub.berdl.kbase.us
username: your_username
cookies:
  _xsrf: "abc123..."
  jupyterhub-session-id: "xyz789..."
  jupyterhub-user-your_username: "token..."
```

You can keep several config files and pick one per invocation:

```bash
# Use a specific config file
berdl-remote --config ~/.berdl/prod-config.yaml status
berdl-remote --config ~/.berdl/dev-config.yaml status
```

Security notes:

- Credentials are stored in `~/.berdl/remote-config.yaml` with file permissions set to `0600`
- Session cookies expire; you may need to reconfigure periodically
- Never share your config file or cookies
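If you placed the config file by hand, it is worth confirming the permissions actually ended up at `0600` (a quick check, assuming a POSIX shell):

```bash
# Expect -rw------- in the listing; tighten the mode otherwise
ls -l ~/.berdl/remote-config.yaml
chmod 600 ~/.berdl/remote-config.yaml
```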