Discofork is a local-first terminal application for evaluating GitHub forks of an open source project.
It uses:
- `gh` for GitHub metadata and fork discovery
- `git` for cloning, fetching, and structured diff facts
- the `codex` CLI for repository and fork interpretation
- OpenTUI + TypeScript for the interactive terminal interface
The goal is practical fork selection, not raw diff archaeology. Discofork gathers compact facts first, then asks Codex to explain what those facts mean.
Large fork graphs are noisy. Most forks are inactive, archived, or exact mirrors. The useful questions are usually:
- what the upstream project does
- which forks are actually maintained
- which forks add meaningful behavior
- which fork is best for stability, features, or experimentation
Discofork is built around that decision workflow.
Core features:

- Paste a GitHub repository URL directly into the TUI
- Discover forks with local `gh`
- Ignore archived forks by default
- Limit discovery sensibly when the fork count is huge
- Preselect more promising forks using recency- and star-based heuristics
- Clone upstream and selected forks locally with `git`
- Compare forks to upstream without dumping giant raw diffs into the model
- Run structured, schema-constrained `codex exec` prompts
- Persist prompts, raw Codex output, logs, and exports under `.discofork/`
- Reuse cached upstream and fork analyses when neither upstream nor fork has changed since the last run
- Delete temporary cloned repositories after each analysis run to avoid local disk bloat
- Export a machine-readable JSON report and a human-readable Markdown report
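The preselection step above can be sketched as a small scoring function. The metadata fields, mode names, and ordering below are illustrative assumptions, not Discofork's actual implementation:

```typescript
// Hypothetical fork metadata shape; Discofork's real fields may differ.
interface ForkMeta {
  fullName: string;
  stars: number;
  pushedAt: string; // ISO timestamp of the fork's last push
  archived: boolean;
}

// Rank forks by stars or recency, dropping archived forks by default,
// mirroring the documented behavior. The exact weighting is an assumption.
function preselectForks(
  forks: ForkMeta[],
  mode: "stars" | "recency",
  limit: number,
  includeArchived = false,
): ForkMeta[] {
  const candidates = forks.filter((f) => includeArchived || !f.archived);
  candidates.sort((a, b) =>
    mode === "stars"
      ? b.stars - a.stars
      : Date.parse(b.pushedAt) - Date.parse(a.pushedAt),
  );
  return candidates.slice(0, limit);
}
```

In this sketch, the archived filter runs before ranking, so an archived fork never displaces a live one regardless of stars.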
Requirements:

- Bun 1.3+
- Git
- GitHub CLI (`gh`)
- Codex CLI (`codex`)
OpenTUI is currently Bun-first, so Discofork uses Bun for runtime and tests.
```
bun install
```

If you publish Discofork to npm, it can also be launched with:
```
npx discofork --help
```

The npm executable still requires Bun on PATH, because the runtime entrypoint is executed through Bun.
For a one-shot installer similar to `curl ... | bash`, use:
```
curl -fsSL https://discofork.ai/install.sh | bash
```

The installer:
- installs Bun if it is missing
- downloads Discofork into `~/.local/share/discofork`
- installs runtime dependencies with Bun
- installs `gh` if it is missing
- installs `codex` if it is missing
- creates a `discofork` launcher in `~/.local/bin`
Installer support notes:
- supports macOS and Linux
- supports `arm64` and `amd64`
- installs `gh` into the selected user-local bin directory
- installs `codex` into the selected user-local bin directory
- does not currently target Windows
You can also target a specific ref:
```
curl -fsSL https://discofork.ai/install.sh | bash -s -- --ref main
```

Start the interactive app:
```
discofork
```

Optionally prefill a repository:
```
discofork --repo cli/go-gh
```

Useful flags:
```
discofork --repo cli/go-gh --fork-scan-limit 60
discofork --repo cli/go-gh --include-archived
discofork --repo cli/go-gh --recommended-fork-limit 8
discofork --repo cli/go-gh --compare-concurrency 3
```

Environment check:
```
bun run start -- doctor
```

Typical workflow:

- Enter a GitHub repository URL or `owner/name`.
- Discofork discovers upstream metadata and scans forks with `gh`.
- Before discovery, choose whether default picks should favor highest-star forks or most recent forks.
- The selection screen shows a filtered fork list, filters unchanged forks before filling the visible scan window, and preselects candidates using the chosen mode.
- Press `Enter` to analyze the selected forks.
- Discofork clones locally, compares against upstream, runs Codex interpretation, and exports the report.
Comparison work runs with bounded parallelism and defaults to 3 forks in flight at once. Override it with `--compare-concurrency`.
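A minimal sketch of bounded parallelism of this kind (not Discofork's actual scheduler): a fixed pool of runners pulls items from a shared index, so at most `limit` comparisons are in flight at once.

```typescript
// Run async jobs over `items` with at most `limit` in flight at a time.
// Results are returned in input order.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor; safe because JS runners are single-threaded
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }
  const runners = Array.from(
    { length: Math.min(limit, items.length) },
    () => runner(),
  );
  await Promise.all(runners);
  return results;
}
```

With the default of 3, a 12-fork analysis would run as four waves of clone-and-compare work rather than 12 simultaneous clones.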
Key controls:
- `Enter`: discover or analyze
- `Shift+Tab`: toggle highest-star vs most-recent defaults on the input screen
- `j`/`k`: move in fork lists
- `space`: toggle a fork
- `/`: focus the filter input
- `a`: restore the default recommended selection
- `c`: clear selection
- `u`: return from results to selection
- `q` or `Ctrl+C`: quit
Run a local environment check with:
```
discofork doctor
```

or:

```
bun run start -- doctor
```

It checks:
- Bun, Git, GitHub CLI, and Codex CLI availability
- `gh` login state
- current GitHub core API rate limit
- Codex login state
If you are developing from source instead of using the installed launcher, the equivalent command is:
```
bun run start -- doctor
```

Every run writes artifacts under `.discofork/`.
```
.discofork/
  cache/
  logs/
  repos/
  runs/<run-id>/
    codex/
    reports/
      analysis.json
      analysis.md
```
Notable files:
- `cache/<repo>/upstream.json`: cached upstream facts and summary keyed by upstream freshness
- `cache/<repo>/forks/<fork>.json`: cached per-fork diff facts and interpretation keyed by both upstream and fork freshness
- `repos/<repo>/...`: temporary clone workspace used during active analysis and deleted afterwards
- `logs/<run-id>.jsonl`: structured app logs
- `runs/<run-id>/codex/.../prompt.md`: exact Codex prompt used
- `runs/<run-id>/codex/.../output.json`: final schema-constrained Codex output
- `runs/<run-id>/reports/analysis.json`: machine-readable final report
- `runs/<run-id>/reports/analysis.md`: human-readable final report
Discofork intentionally avoids sending giant raw diffs to Codex.
Instead it gathers:
- upstream metadata and README/manifests
- top-level tree structure
- recent commit subjects
- ahead/behind counts
- changed-file counts, insertions, deletions, rename counts
- top changed directories and file types
- a compact list of the most significant changed files
Those structured facts are then passed into Codex with JSON Schema constraints.
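As a sketch, the compact fact bundle described above might look like the following. The field names are illustrative assumptions, not Discofork's actual schema; `parseAheadBehind` shows one common way ahead/behind counts can be derived, from `git rev-list --left-right --count upstream...fork` output.

```typescript
// Hypothetical shape of the compact fact bundle handed to Codex.
interface ForkDiffFacts {
  aheadBy: number;
  behindBy: number;
  changedFiles: number;
  insertions: number;
  deletions: number;
  renames: number;
  topDirectories: { path: string; changedFiles: number }[];
  topFileTypes: { extension: string; changedFiles: number }[];
  significantFiles: string[]; // compact list, never full diffs
  recentCommitSubjects: string[];
}

// `git rev-list --left-right --count upstream...fork` prints two counts:
// commits only on the upstream side (fork is behind by this many) and
// commits only on the fork side (fork is ahead by this many).
function parseAheadBehind(raw: string): { aheadBy: number; behindBy: number } {
  const [behind, ahead] = raw.trim().split(/\s+/).map(Number);
  return { aheadBy: ahead, behindBy: behind };
}
```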
Discofork also keeps an internal cache. If the upstream repository and a fork both have the same `pushedAt` snapshot as a previous run, Discofork reuses the stored facts and analysis instead of cloning, diffing, and prompting Codex again. Temporary clone directories are removed after each run, so the cache is what persists.
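The reuse rule can be sketched as a pure check over the two `pushedAt` snapshots; the field names here are illustrative, not the real cache schema:

```typescript
// Minimal sketch of the cache-freshness rule: reuse a stored analysis
// only when both pushedAt snapshots match the previous run exactly.
interface CachedAnalysis {
  upstreamPushedAt: string; // upstream's pushedAt at cache time
  forkPushedAt: string; // fork's pushedAt at cache time
}

function canReuseCache(
  cached: CachedAnalysis | undefined,
  upstreamPushedAt: string,
  forkPushedAt: string,
): boolean {
  return (
    cached !== undefined &&
    cached.upstreamPushedAt === upstreamPushedAt &&
    cached.forkPushedAt === forkPushedAt
  );
}
```

A push to either side invalidates the entry, which is why the cache is keyed by both freshness values rather than the fork's alone.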
If Codex fails, Discofork falls back to deterministic heuristic summaries so the run still completes.
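The fallback can be sketched as a simple wrapper; `interpret` and `heuristicSummary` are hypothetical stand-ins for the real Codex call and the deterministic summarizer:

```typescript
// If the Codex interpretation step throws, fall back to a deterministic
// heuristic summary so the run still completes.
async function interpretWithFallback(
  facts: unknown,
  interpret: (facts: unknown) => Promise<string>,
  heuristicSummary: (facts: unknown) => string,
): Promise<{ summary: string; source: "codex" | "heuristic" }> {
  try {
    return { summary: await interpret(facts), source: "codex" };
  } catch {
    return { summary: heuristicSummary(facts), source: "heuristic" };
  }
}
```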
The website now has a real backend shape:
- `GET /api/repo/:owner/:repo` is the backend boundary for repo lookup
- the backend checks Postgres for a cached report first
- if no cached report exists, it writes or refreshes a queued row in Postgres
- it then enqueues the repo in Redis with dedupe, so the same repo is not queued repeatedly
- the frontend route fetches that backend endpoint instead of touching Postgres or Redis directly
If `DATABASE_URL` and `REDIS_URL` are not set for the web app, it falls back to the existing mock/demo data.
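The lookup flow above can be reduced to a small decision step. The status names and the shape below are assumptions for illustration, not the real Postgres schema:

```typescript
// Hypothetical cached-row states as the web backend might see them.
type RepoState =
  | { status: "ready"; report: unknown }
  | { status: "queued" }
  | { status: "failed" };

// Serve directly only when a finished report is cached in Postgres;
// otherwise write/refresh the queued row and enqueue in Redis (the
// Redis-side dedupe keeps the same repo from being queued repeatedly).
function decideRepoAction(
  cached: RepoState | undefined,
): "serve-report" | "enqueue" {
  return cached?.status === "ready" ? "serve-report" : "enqueue";
}
```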
A separate worker process now exists for backend processing:
```
bun run worker
```

The worker:
- watches the Redis queue for repo jobs
- runs Discofork discovery and analysis for the queued repo
- stores the final report JSON in Postgres
- marks failures in Postgres so the web backend can surface queued vs failed vs ready state
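One worker iteration over a queued job can be sketched as below; `analyze`, `saveReport`, and `markFailed` are hypothetical stand-ins for the real discovery/analysis run and the Postgres writes:

```typescript
// Process one queued repo: store the report on success, or mark the
// row failed so the web backend can surface queued vs failed vs ready.
async function processJob(
  repo: string,
  analyze: (repo: string) => Promise<unknown>,
  saveReport: (repo: string, report: unknown) => Promise<void>,
  markFailed: (repo: string, error: string) => Promise<void>,
): Promise<void> {
  try {
    const report = await analyze(repo);
    await saveReport(repo, report); // Postgres row: status -> ready
  } catch (err) {
    await markFailed(repo, String(err)); // Postgres row: status -> failed
  }
}
```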
Important environment variables for the worker:
- `DATABASE_URL`
- `REDIS_URL`
- `DISCOFORK_FORK_SCAN_LIMIT`: optional, defaults to `25`
- `DISCOFORK_RECOMMENDED_FORK_LIMIT`: optional, defaults to `6`
- `DISCOFORK_COMPARE_CONCURRENCY`: optional, defaults to `3`
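A sketch of how the worker might read these tuning variables with their documented defaults; `intFromEnv` is a hypothetical helper, not part of Discofork's code:

```typescript
// Read an optional integer env var, falling back to a default when the
// variable is unset or not a valid number.
function intFromEnv(name: string, fallback: number): number {
  const raw = process.env[name];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isFinite(parsed) ? parsed : fallback;
}

// Documented defaults: 25 scanned forks, 6 recommended, 3 concurrent compares.
const forkScanLimit = intFromEnv("DISCOFORK_FORK_SCAN_LIMIT", 25);
const recommendedForkLimit = intFromEnv("DISCOFORK_RECOMMENDED_FORK_LIMIT", 6);
const compareConcurrency = intFromEnv("DISCOFORK_COMPARE_CONCURRENCY", 3);
```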
Run database migrations with:
```
bun run migrate
```

Migrations live in `migrations/`.
Run checks:
```
bun run check
```

Run tests:

```
bun test
```

Typecheck only:

```
bun run typecheck
```

The website is a separate app under `web/`.
For Railway:
- Create a web service from this repo with Root Directory set to `/web`.
- Keep the service config in `web/railway.toml`.
- Add PostgreSQL and Redis as separate Railway template services.
Railway's config-as-code applies to a single service deployment, so the TOML config handles the web app itself, while Redis and Postgres should be provisioned as managed services in the same Railway project.
CLI example:
```
railway deploy -t postgres
railway deploy -t redis
```

Once those exist, connect them to the web service with environment variables such as:
- `DATABASE_URL`
- `REDIS_URL`
For the worker, create a separate service from the repo root and run:
```
bun run worker
```

Before first use, run the root migration command once against the target Postgres database:
```
bun run migrate
```

A Railway-ready worker container is included at `Dockerfile.worker`.
The worker container installs:
- Bun
- Git
- GitHub CLI (`gh`)
- Codex CLI (`codex`)
Required worker environment variables:
- `DATABASE_URL`
- `REDIS_URL`
- `GH_TOKEN` or `GITHUB_TOKEN`
- `OPENAI_API_KEY`
Optional worker tuning:
- `DISCOFORK_FORK_SCAN_LIMIT`
- `DISCOFORK_RECOMMENDED_FORK_LIMIT`
- `DISCOFORK_COMPARE_CONCURRENCY`
The container startup script will:
- verify required env vars
- authenticate Codex from `OPENAI_API_KEY` if needed
- run `bun run migrate`
- start `bun run worker`
On Railway, create a separate service from the repo root and point it at `Dockerfile.worker`.
Checked-in example exports live in:
- `examples/cli-go-gh.analysis.md`
- `examples/cli-go-gh.analysis.json`
These were produced from a real smoke run against `cli/go-gh`.
Current defaults are intentionally pragmatic:
- archived forks are hidden unless explicitly included
- large fork networks are sampled instead of exhaustively analyzed
- recommended forks are chosen using maintenance/activity signals before deep analysis
- fork analysis is bounded and conservative to keep local execution understandable
This project is meant to be useful first. It is not trying to be a full GitHub mining platform.