Docker images providing JupyterLab with a Ruby (IRuby) kernel, a Python NLP bridge, and a broad gem collection for LLM integration, DSPy, vector search, and terminal UI development.
The project ships two layered images: `base` (JupyterLab + Python AI stack) and `nlp` (Ruby 3.3.8 + 100+ gems, built on `base`). Run the `nlp` image directly; `base` exists as a foundation.
- IRuby Kernel — Ruby runs natively inside JupyterLab notebooks; no subprocess wrappers, no language switching
- Python NLP Bridge — `pycall` + `ruby-spacy` expose spaCy's full pipeline (NER, tokenization, dependency parsing) from Ruby code
- DSPy & LLM Stack — Full `dspy` suite with provider adapters, plus `ruby_llm`, `langchainrb`, `groq`, `ollama-ai`, and `rllama` for local GGUF inference
- Vector & Semantic Search — pgvector (PostgreSQL), Chroma DB, and Redis available without extra setup when running via Compose
- Async Runtime — `async`, `falcon`, `circuit_breaker`, and `jongleur` for non-blocking workflows alongside synchronous notebook code
- Charm/Bubble TUI — `bubbletea`, `glamour`, `lipgloss`, `gum`, and the full `tty-*` toolkit for terminal UIs built and tested in notebooks
- CUDA/CPU Toggle — The `base` image supports a `CUDA_SUPPORT` build arg that switches all PyTorch-backed installs between CPU and CUDA 12.1 wheels
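The async gems above are built on Ruby's fiber primitive. A stdlib-only sketch of the cooperative model they layer over (no gems involved; `async` adds a real scheduler, timers, and non-blocking IO on top of this):

```ruby
# A fiber runs until it yields, handing control back to the caller --
# the cooperative building block under async/falcon.
job = Fiber.new do
  puts 'starting work'
  Fiber.yield :paused   # suspend instead of blocking the whole process
  puts 'finishing work'
  :done
end

p job.resume  # runs until the yield  => :paused
p job.resume  # runs to completion    => :done
```

Inside a notebook this is what lets long-running IO coexist with ordinary synchronous cells.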
- Podman or Docker
- Podman Compose or Docker Compose v2
Podman is the default runtime. Replace `podman` / `podman-compose` with `docker` / `docker compose` throughout if preferred.
```sh
git clone https://github.com/b08x/docker-jupyter-ruby.git
cd docker-jupyter-ruby
bundle install

rake build/nlp     # Builds base then nlp (recommended)
rake build/base    # Base image only
rake build-all     # Both images in sequence
```

The Rakefile detects the container engine automatically (checks for a running dockerd, falls back to podman). Image ownership is read from the `DOCKER_USER` env var, defaulting to `$USER`.
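The same engine selection can be approximated in plain shell when working outside Rake — a hedged sketch of the behavior described above, not the Rakefile's exact code:

```sh
#!/bin/sh
# Prefer a responsive Docker daemon; otherwise fall back to podman.
detect_engine() {
  if docker info >/dev/null 2>&1; then
    echo docker
  else
    echo podman
  fi
}

ENGINE="$(detect_engine)"
OWNER="${DOCKER_USER:-$USER}"   # image ownership, as the Rakefile reads it
echo "engine=${ENGINE} owner=${OWNER}"
```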
Direct build (skips Rake):

```sh
podman build --format docker -f base/Containerfile --rm -t b08x/notebook-base:latest .
podman build --format docker -f nlp/Containerfile --rm -t b08x/notebook-nlp:latest .
```

```sh
mkdir -p ./work
podman run --rm -p 8888:8888 \
  -v "${PWD}/work":/home/jovyan/work \
  --user "$(id -u):$(id -g)" \
  b08x/notebook-nlp:latest
```

Open http://localhost:8888. The authentication token appears in container stdout.
Starts `nlp-notebook`, `redis` (redis-stack), and `pgvector` (PostgreSQL + pgvector):

```sh
cp compose.yaml.example compose.yaml   # Customize volumes, ports, or GPU settings
cp .env.example .env                   # Set UID, GID, WORKSPACE, and API keys
mkdir -p ./data
podman-compose up -d
podman-compose logs nlp-notebook | grep token   # Get Jupyter token
podman-compose down
```

`compose.yaml.example` is the baseline — it omits personal host-directory mounts present in the default `compose.yaml`. Edit the copy to add any additional volume bindings before starting.
Service endpoints:

- Jupyter: http://localhost:8888
- RedisInsight: http://localhost:8003
- PostgreSQL/pgvector: `localhost:5432` (database: `rubynlp`, user: `postgres`)
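Before running notebook code against these services, it can help to confirm they are actually listening. A stdlib-only probe (host and ports taken from the endpoint list above; adjust if you remapped them in `compose.yaml`):

```ruby
require 'socket'
require 'timeout'

# Returns true if a TCP connection to host:port succeeds within the timeout.
def port_open?(host, port, timeout_s: 1)
  Timeout.timeout(timeout_s) do
    TCPSocket.new(host, port).close
    true
  end
rescue StandardError
  false
end

{ 'Jupyter' => 8888, 'RedisInsight' => 8003, 'PostgreSQL/pgvector' => 5432 }.each do |name, port|
  puts format('%-20s %s', name, port_open?('localhost', port) ? 'up' : 'DOWN')
end
```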
```ruby
require 'ruby-spacy'
require 'terminal-table'

nlp = Spacy::Language.new('en_core_web_sm')
doc = nlp.read("Apple Inc. is planning to open a new store in San Francisco.")

rows = doc.ents.map { |ent| [ent.text, ent.label, ent.start_char, ent.end_char] }
puts Terminal::Table.new(headings: ['Entity', 'Type', 'Start', 'End'], rows: rows)
```

`ruby-spacy` delegates to the Python spaCy process via `pycall`. The `respond_to_missing.patch` in `nlp/` keeps the delegation working across `pycall` versions.
```ruby
require 'langchain'

llm = Langchain::LLM::OpenAI.new(api_key: ENV['OPENAI_API_KEY'])
assistant = Langchain::Assistant.new(llm: llm, instructions: "You're a Ruby expert")
assistant.add_message_and_run!(content: "Explain procs vs lambdas")
puts assistant.messages.last.content
```

`langchainrb` supports OpenAI, Groq, Ollama, Google, and other providers behind a common interface.
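For local models, you can also bypass the provider abstractions entirely: an Ollama server speaks plain HTTP. A stdlib-only sketch against Ollama's documented `/api/generate` endpoint (the model name here is an assumption — use one you have pulled locally):

```ruby
require 'json'
require 'net/http'

# Build the JSON body for Ollama's /api/generate endpoint.
def ollama_payload(prompt, model: 'llama3')
  { model: model, prompt: prompt, stream: false }.to_json
end

# POST to a local Ollama server (default port 11434) and return the text.
def ollama_generate(prompt, model: 'llama3', host: 'localhost', port: 11_434)
  uri = URI("http://#{host}:#{port}/api/generate")
  res = Net::HTTP.post(uri, ollama_payload(prompt, model: model),
                       'Content-Type' => 'application/json')
  JSON.parse(res.body).fetch('response')
end
```

The bundled `ollama-ai` gem wraps the same API with retries and streaming; this is only the minimal form.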
```ruby
require 'sequel'
require 'pgvector'

DB = Sequel.connect('postgres://postgres@pgvector:5432/rubynlp')
DB.run('CREATE EXTENSION IF NOT EXISTS vector')
DB.run('CREATE TABLE IF NOT EXISTS documents (id SERIAL PRIMARY KEY, content TEXT, embedding vector(384))')

class Document < Sequel::Model
  plugin :pgvector, :embedding
end

similar = Document.nearest_neighbors(:embedding, query_vector, distance: 'euclidean').limit(5)
```

After bulk inserts, add an HNSW index for approximate nearest-neighbor queries:
```ruby
DB.add_index :documents, :embedding, type: 'hnsw', opclass: 'vector_l2_ops'
```

Redis is also available for structured caching via Ohm:

```ruby
require 'ohm'

class LLMResponse < Ohm::Model
  attribute :prompt
  attribute :response
  attribute :model

  index :prompt
  index :model   # Ohm's find() requires an index on every attribute it filters by

  def self.cached_or_fetch(prompt, model:)
    cached = find(prompt: prompt, model: model).first
    return cached.response if cached

    result = Langchain::LLM::OpenAI.new(api_key: ENV['OPENAI_API_KEY'])
                                   .chat(messages: [{ role: 'user', content: prompt }])
    create(prompt: prompt, response: result.chat_completion, model: model).response
  end
end
```

- Add or update gems — Edit `nlp/Gemfile`, run `bundle update`, rebuild with `rake build/nlp`. Commit the updated `Gemfile.lock`.
- Change Ruby version — Update `FROM rubylang/ruby:<version>-jammy` in `nlp/Containerfile` and `.ruby-version` together.
- Python packages — Edit the `pip install` lines in `base/Containerfile`; rebuild both images afterwards.
- CUDA support — Pass `--build-arg CUDA_SUPPORT=true` to `base/Containerfile` to switch all PyTorch installs to CUDA 12.1 wheels.
- Jupyter config — Edit `base/jupyter_server_config.py`. Remote access is disabled by default; set `c.ServerApp.allow_remote_access = True` to enable it.
Build fails — Ensure 4 GB+ of available memory. Check `nlp/Gemfile.lock` for conflicting constraints.

IRuby kernel missing — Verify `iruby register --force` ran during build:

```sh
podman logs notebook-nlp | grep iruby
```

Port conflicts — Change the host-side port in `compose.yaml` (e.g., `"8889:8888"`) if 8888 is occupied.

SELinux volume errors (Fedora/RHEL) — The `:Z` flag in `compose.yaml` handles relabeling automatically. For manual `podman run`, append `:Z` to volume mounts.

Database not ready — `pgvector` has a healthcheck; `nlp-notebook` won't start until it passes. Check with:

```sh
podman-compose ps
podman exec -it redis redis-cli ping   # should return PONG
```

Fork the repository and open a pull request. Include `rake build/nlp` output confirming a successful local build. Update `Gemfile.lock` if changing gem dependencies.
MIT. Base images carry their own licenses; see Jupyter Docker Stacks for details.