bsreeram08/codebase2nlm

codebase2nlm

Crawl a codebase and emit markdown document(s) for uploading to NotebookLM as sources.

Install (macOS)

One-time setup if you don't have pipx yet:

brew install pipx
pipx ensurepath
# open a new terminal after this

Option A — install from GitHub (recommended)

Public repo:

pipx install git+https://github.com/bsreeram08/codebase2nlm.git

Specific branch, tag, or commit:

pipx install git+https://github.com/bsreeram08/codebase2nlm.git@main
pipx install git+https://github.com/bsreeram08/codebase2nlm.git@v0.1.0
pipx install git+https://github.com/bsreeram08/codebase2nlm.git@<commit-sha>

Private repo (uses your SSH key):

pipx install git+ssh://git@github.com/bsreeram08/codebase2nlm.git

Update to the latest version on the default branch:

pipx upgrade codebase2nlm
# or force a clean reinstall
pipx install --force git+https://github.com/bsreeram08/codebase2nlm.git

Option B — install from a local clone

git clone https://github.com/bsreeram08/codebase2nlm.git
cd codebase2nlm
pipx install .

Uninstall

pipx uninstall codebase2nlm

Usage

# crawl the current directory
codebase2nlm

# crawl a specific project
codebase2nlm ~/projects/myapp

# custom output location
codebase2nlm ~/projects/myapp -o ~/Desktop/myapp-notebooklm

# tweak the word limit per output file
codebase2nlm ~/projects/myapp --max-words 400000

# tweak line and size limits too (NotebookLM-friendly defaults are used automatically)
codebase2nlm ~/projects/myapp --max-lines 95000 --max-mb 190

# ignore .gitignore (still respects .crawlignore and built-in skips)
codebase2nlm ~/projects/myapp --no-gitignore

Output lands in <PATH>/notebooklm_output/ by default. Upload the resulting codebase.md (or each codebase_partNN.md if the codebase was too large for a single source) to NotebookLM.
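Before uploading, it can help to confirm that each generated file sits under the limits quoted in this README. The following is a minimal sketch, not part of codebase2nlm: the function names `summarize_output` and `check_output` are hypothetical, the 500k-word, 200MB, and 50-source figures are taken from the "What it does" section, and counting words by splitting on whitespace is only an approximation of however NotebookLM counts them.

```python
from pathlib import Path

# Limits as quoted in this README. Treating 1 MB as 1024 * 1024 bytes is an assumption.
WORD_LIMIT = 500_000
SIZE_LIMIT_MB = 200
MAX_SOURCES = 50


def summarize_output(output_dir):
    """Return one (name, words, lines, size_mb) tuple per generated markdown file."""
    rows = []
    for path in sorted(Path(output_dir).glob("codebase*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        words = len(text.split())  # whitespace-separated tokens (approximation)
        lines = len(text.splitlines())
        size_mb = path.stat().st_size / (1024 * 1024)
        rows.append((path.name, words, lines, size_mb))
    return rows


def check_output(output_dir):
    """Return human-readable problems; an empty list means nothing looked over-limit."""
    rows = summarize_output(output_dir)
    problems = []
    if not rows:
        problems.append("no codebase*.md files found")
    if len(rows) > MAX_SOURCES:
        problems.append(f"{len(rows)} files exceeds the {MAX_SOURCES}-source limit")
    for name, words, _lines, size_mb in rows:
        if words > WORD_LIMIT:
            problems.append(f"{name}: {words} words exceeds {WORD_LIMIT}")
        if size_mb > SIZE_LIMIT_MB:
            problems.append(f"{name}: {size_mb:.1f} MB exceeds {SIZE_LIMIT_MB} MB")
    return problems
```

For example, `check_output("notebooklm_output")` run from the crawled project's directory would return an empty list when every part looks uploadable under these assumptions.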

What it does

  • Honors .gitignore and .crawlignore at the repo root (gitignore syntax).
  • Skips common noise: .git, node_modules, __pycache__, .venv, lockfiles, etc.
  • Lists binary files in the tree, tagged (binary — contents omitted), and skips their contents.
  • Produces a full ASCII file tree at the top, followed by every text file's contents in labeled code fences.
  • Auto-splits into codebase_part01.md, codebase_part02.md, ... when any NotebookLM per-source limit would be exceeded:
    • word count (default target: 450k to stay below 500k),
    • line count (default target: 95k lines),
    • upload size (default target: 190MB to stay below 200MB).
  • Chunks oversized files into labeled sections like (chunk 1 of N) instead of leaving them as a single over-limit section.
  • Warns when the output exceeds 50 parts (NotebookLM's per-notebook source limit).
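The splitting and chunking behavior listed above can be pictured as two passes of the same greedy grouping. This is a sketch under stated assumptions, not the tool's actual implementation: the names `pack`, `line_cost`, `text_cost`, and `split_into_parts` are hypothetical, the defaults mirror the targets listed above, 1 MB is assumed to be 1024 * 1024 bytes, and the overhead of headings and code fences around each section is ignored.

```python
# (max words, max lines, max bytes), mirroring the default targets listed above.
DEFAULT_LIMITS = (450_000, 95_000, 190 * 1024 * 1024)
MAX_PARTS = 50  # source-count limit mentioned above


def line_cost(line):
    """Cost of one line as (words, lines, bytes); +1 byte for its newline."""
    return (len(line.split()), 1, len(line.encode("utf-8")) + 1)


def text_cost(text):
    """Cost of a whole text as (words, lines, bytes)."""
    return (len(text.split()), len(text.splitlines()), len(text.encode("utf-8")))


def pack(items, cost_of, limits):
    """Greedily group items in order so each group's summed cost stays within limits.

    An item that exceeds a limit on its own still gets a group to itself.
    """
    groups, current, total = [], [], (0, 0, 0)
    for item in items:
        cost = cost_of(item)
        combined = tuple(a + b for a, b in zip(total, cost))
        if current and any(c > limit for c, limit in zip(combined, limits)):
            groups.append(current)
            current, combined = [], cost
        current.append(item)
        total = combined
    if current:
        groups.append(current)
    return groups


def split_into_parts(sections, limits=DEFAULT_LIMITS):
    """Turn (label, text) sections into parts, each a list of (label, text)."""
    units = []
    for label, text in sections:
        # Pass 1: break a section that is too large into labeled chunks.
        chunks = pack(text.splitlines(), line_cost, limits)
        if len(chunks) <= 1:
            units.append((label, text))
        else:
            for i, chunk in enumerate(chunks, 1):
                units.append((f"{label} (chunk {i} of {len(chunks)})", "\n".join(chunk)))
    # Pass 2: pack sections and chunks into output parts.
    return pack(units, lambda unit: text_cost(unit[1]), limits)
```

With deliberately tiny limits such as `(6, 100, 10_000)`, two three-word files share one part while a twelve-word file is cut into two chunks that each get their own part. A caller could then compare `len(parts)` against `MAX_PARTS` to decide whether to warn.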
