kanhaiyaray/Codebase-Explorer

🔭 Codebase Explorer

Explore any public GitHub repository with AI-powered analysis — architecture graphs, dependency maps, code smell detection, tech stack auto-detection, and a persistent multi-turn AI chat. Runs fully in the browser or with an optional Node.js backend for authentication, saved analyses, and encrypted API key storage.


Overview

Codebase Explorer v4 is a browser-native tool for understanding any public GitHub repository at a glance, and it needs no backend by default. Point it at an owner/repo and it renders the folder structure, language distribution, dependency graph, code smell warnings, and an AI chat assistant. Frontend-only mode requires no server and no login: after npm install, npm run dev:client is all you need.


Features

🗺 Architecture View

Radial folder graph — the repo at the centre, top-level directories on an inner ring, key subdirectories on an outer ring. Node size reflects file count. Rendered as pure SVG with no third-party graph library.
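
The ring layout described above can be sketched in a few lines. This is a minimal illustration, not the code in ArchGraph.jsx: the function names and the square-root sizing rule are assumptions made for the example.

```javascript
// Place `count` nodes evenly on a circle of radius `radius` around (cx, cy),
// starting at 12 o'clock and moving clockwise (SVG's y axis points down).
function ringPositions(count, radius, cx = 0, cy = 0) {
  return Array.from({ length: count }, (_, i) => {
    const angle = (2 * Math.PI * i) / count - Math.PI / 2;
    return {
      x: cx + radius * Math.cos(angle),
      y: cy + radius * Math.sin(angle),
    };
  });
}

// Hypothetical sizing rule: the radius grows with the square root of the
// file count, so node area (not diameter) tracks the number of files.
function nodeRadius(fileCount, min = 6, scale = 2) {
  return min + scale * Math.sqrt(fileCount);
}
```

Calling ringPositions once with a small radius and once with a larger one would give the inner and outer rings.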

🔗 Import Dependency Graph

  • Fetches up to 60 JS/TS/Python source files and parses import, require, export … from, and from … import statements
  • Resolves relative imports to absolute repo paths
  • Circular dependency detection via depth-first search — highlighted in red
  • Orphan file detection — files referenced by nobody
  • Progress indicator during parallel batch fetching (4 concurrent requests)
  • Hover any node for its name, edge count, and flags
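
One way to fetch files four at a time while reporting progress is sketched below. The helper name, its signature, and the fetchOne callback are hypothetical; the real logic lives in src/utils/github.js and may differ.

```javascript
// Fetch `paths` in consecutive batches of `batchSize` parallel requests.
// `fetchOne` is a caller-supplied async function (hypothetical here).
async function fetchInBatches(paths, fetchOne, batchSize = 4, onProgress = () => {}) {
  const results = [];
  for (let i = 0; i < paths.length; i += batchSize) {
    const batch = paths.slice(i, i + batchSize);
    // allSettled keeps one failed file from aborting the whole run.
    const settled = await Promise.allSettled(batch.map((p) => fetchOne(p)));
    settled.forEach((r, j) => {
      results.push({
        path: batch[j],
        content: r.status === "fulfilled" ? r.value : null,
      });
    });
    // Report how many files have been attempted so far.
    onProgress(Math.min(i + batchSize, paths.length), paths.length);
  }
  return results;
}
```

Each batch waits for its slowest request before the next one starts, which is simpler than a rolling pool and keeps at most four requests in flight.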

🔍 Code Smell Detector

Runs entirely in the browser using only file path and size metadata — no file content downloaded.

| Smell | Threshold | Severity |
| --- | --- | --- |
| Large file | > 50 KB | Medium; High above 150 KB |
| Deep nesting | > 5 folder levels | Low |
| Duplicate filenames | Same name in ≥ 2 dirs (excluding intentional ones like index.js) | Low |
| Crowded root | > 12 files directly in root | Medium |
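
The rules in the table can be expressed over a flat list of { path, size } entries. The sketch below is illustrative only: it assumes 1 KB = 1024 bytes, and the ignore list and result shape are hypothetical rather than taken from smellDetector.js.

```javascript
const KB = 1024; // assumption: thresholds are measured in 1024-byte units
// Hypothetical list of filenames that are expected to repeat across folders.
const IGNORED_DUPLICATES = new Set(["index.js", "index.ts", "README.md", "__init__.py"]);

function detectSmells(files) {
  const smells = [];
  const dirsByName = new Map(); // filename -> set of directories containing it
  let rootFiles = 0;

  for (const { path, size } of files) {
    const parts = path.split("/");
    const name = parts[parts.length - 1];
    const depth = parts.length - 1; // folder levels above the file

    if (size > 150 * KB) smells.push({ type: "large-file", severity: "high", path });
    else if (size > 50 * KB) smells.push({ type: "large-file", severity: "medium", path });

    if (depth > 5) smells.push({ type: "deep-nesting", severity: "low", path });
    if (depth === 0) rootFiles += 1;

    if (!IGNORED_DUPLICATES.has(name)) {
      if (!dirsByName.has(name)) dirsByName.set(name, new Set());
      dirsByName.get(name).add(parts.slice(0, -1).join("/"));
    }
  }

  for (const [name, dirs] of dirsByName) {
    if (dirs.size >= 2) smells.push({ type: "duplicate-name", severity: "low", path: name });
  }
  if (rootFiles > 12) smells.push({ type: "crowded-root", severity: "medium", path: "/" });
  return smells;
}
```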

🎨 Tech Stack Auto-Detection

Two-pass detection from file extensions and package.json dependencies. Detects framework, primary language, build tool, state management, styling approach, and test framework. Supports 30+ frameworks including Next.js, Nuxt, Remix, Vue, Angular, Svelte, Django, FastAPI, Rails, and more.
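
A two-pass detector of this kind might look roughly like the following. The lookup tables are tiny, hypothetical stand-ins for the real rules in techStackDetector.js, and the return shape is an assumption.

```javascript
// Hypothetical lookup tables; the real detector covers many more entries.
const EXT_LANGUAGE = { ".ts": "TypeScript", ".tsx": "TypeScript", ".js": "JavaScript", ".jsx": "JavaScript", ".py": "Python", ".rb": "Ruby" };
// Order matters: meta-frameworks are listed before the libraries they build on.
const FRAMEWORK_DEPS = [["next", "Next.js"], ["nuxt", "Nuxt"], ["@angular/core", "Angular"], ["svelte", "Svelte"], ["vue", "Vue"], ["react", "React"]];
const BUILD_DEPS = [["vite", "Vite"], ["webpack", "Webpack"]];
const TEST_DEPS = [["vitest", "Vitest"], ["jest", "Jest"]];

function firstMatch(rules, deps) {
  const hit = rules.find(([name]) => name in deps);
  return hit ? hit[1] : null;
}

function detectStack(paths, packageJson = {}) {
  // Pass 1: count files per language by extension.
  const counts = {};
  for (const p of paths) {
    const ext = (/\.[^./]+$/.exec(p) || [""])[0];
    const lang = EXT_LANGUAGE[ext];
    if (lang) counts[lang] = (counts[lang] || 0) + 1;
  }
  const language = Object.keys(counts).sort((a, b) => counts[b] - counts[a])[0] || null;

  // Pass 2: look for well-known packages in package.json.
  const deps = { ...packageJson.dependencies, ...packageJson.devDependencies };
  return {
    language,
    framework: firstMatch(FRAMEWORK_DEPS, deps),
    buildTool: firstMatch(BUILD_DEPS, deps),
    testFramework: firstMatch(TEST_DEPS, deps),
  };
}
```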

💬 Persistent AI Chat Panel

  • File mode — full file content sent as context
  • Codebase mode — all AI-generated per-file summaries injected as context
  • Four explain styles: ELI5, Standard, Senior, Interview
  • Quick-action presets — one-click prompts like "Summarise this file" or "Find bugs"
  • Supports Anthropic Claude and Groq / LLaMA with automatic model fallback
  • Retry logic for HTTP 429, 503, 502, 529; 30-second timeout via AbortController
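
The retry and timeout behaviour could be approximated as below. This is a sketch under assumptions: the function name, the retry count, the exponential backoff, and the injectable fetchImpl parameter are all hypothetical and not taken from src/utils/llm.js.

```javascript
const RETRYABLE = new Set([429, 502, 503, 529]);

async function fetchWithRetry(
  url,
  options = {},
  { retries = 2, timeoutMs = 30_000, baseDelayMs = 500, fetchImpl = fetch } = {}
) {
  for (let attempt = 0; ; attempt += 1) {
    // A fresh AbortController per attempt enforces the per-request timeout.
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      const res = await fetchImpl(url, { ...options, signal: controller.signal });
      // Return on success, on a non-retryable status, or when out of attempts.
      if (!RETRYABLE.has(res.status) || attempt >= retries) return res;
    } finally {
      clearTimeout(timer);
    }
    // Hypothetical policy: exponential backoff before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
  }
}
```

In this sketch a timeout surfaces as a rejected promise and is not retried; whether the app retries timeouts is not stated above.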

🧠 AI Codebase Memory

Every file you open is silently summarised by the AI and stored in memory. Switch to Codebase mode to ask cross-file questions — the more files you browse, the richer the answers.

💻 File Viewer

Line-numbered code display. Binary files (images, fonts, compiled artifacts) are detected by extension and show a direct GitHub link instead of garbled text.

📊 Language Stats

Horizontal bar chart of language distribution. Relative bar widths ensure every language — even tiny ones — is visible.

📂 Collapsible File Tree Sidebar

  • 🚀 Entry-point markers on files like index.js, main.tsx, app.py, server.js
  • File-type pill badges (JS, TS, PY, RS, …)
  • Alphabetically sorted, expand/collapse per directory

Screenshots

Screenshots are located in Public/:

| File | Description |
| --- | --- |
| codebase explorer-1.jpg | Architecture radial graph view |
| codebase explorer-2.jpg | Dependency graph + AI chat |
| codebase explorer-3.jpg | Code smell panel + language stats |

Quick Start

Frontend only (no backend, no login)

# 1. Clone or unzip
cd codebase-explorer

# 2. Install dependencies
npm install

# 3. Configure environment
cp .env.example .env
# Add at least one AI key — see Environment Variables below

# 4. Start the Vite dev server
npm run dev:client
# Open http://localhost:5173

Full platform (frontend + backend + GitHub OAuth)

# Start both servers concurrently
npm run dev
# Frontend → http://localhost:5173
# Backend  → http://localhost:8787

Try a repository

Enter any of the following in the URL bar:

facebook/react
vercel/next.js
https://github.com/expressjs/express
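
Both input forms can be normalised to an owner and repo pair with a small helper. The function below is a hypothetical sketch of that step, not the app's actual parser.

```javascript
// Accepts "owner/repo" or a full github.com URL and returns { owner, repo },
// or null when the input does not contain both parts.
function parseRepoInput(input) {
  const cleaned = input
    .trim()
    .replace(/^https?:\/\/(www\.)?github\.com\//i, "") // drop the URL prefix
    .replace(/\/+$/, "") // drop trailing slashes
    .replace(/\.git$/i, ""); // drop a clone-URL suffix
  const [owner, repo] = cleaned.split("/");
  if (!owner || !repo) return null;
  return { owner, repo };
}
```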

Environment Variables

Copy .env.example to .env and fill in the values you need.

Frontend (Vite — prefix VITE_)

| Variable | Required | Description |
| --- | --- | --- |
| VITE_GITHUB_TOKEN | Optional | GitHub PAT. Raises the rate limit from 60 to 5,000 req/hr. Only public repo read scope is needed. Create one at github.com/settings/tokens. |
| VITE_ANTHROPIC_API_KEY | At least one AI key | Claude API key from console.anthropic.com. |
| VITE_GROQ_API_KEY | At least one AI key | Groq API key from console.groq.com. Free tier available with very low latency. |
| VITE_API_BASE_URL | Optional | Override the backend URL (default: http://localhost:8787). |

Backend (Node.js server)

| Variable | Required | Description |
| --- | --- | --- |
| PORT | Optional | Backend port (default: 8787). |
| APP_ORIGIN | Optional | Public origin of the backend (default: auto-detected from request host). |
| FRONTEND_ORIGIN | Optional | Frontend origin for CORS (default: http://localhost:5173). |
| GITHUB_CLIENT_ID | OAuth only | GitHub OAuth App Client ID. |
| GITHUB_CLIENT_SECRET | OAuth only | GitHub OAuth App Client Secret. |
| GITHUB_TOKEN | Optional | Server-side GitHub PAT used for repo/file proxy calls. |
| ANTHROPIC_API_KEY | Optional | Server-side Anthropic key (supplements or replaces user keys). |
| GROQ_API_KEY | Optional | Server-side Groq key. |
| PLATFORM_ENCRYPTION_SECRET | Recommended | 32-character secret used to encrypt stored user API keys with AES-256-GCM. |
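
For readers unfamiliar with AES-256-GCM, the sketch below shows one common way to encrypt and decrypt a stored key with node:crypto. It is not the code in server/lib/security.js: hashing the secret with SHA-256 to obtain the key and the iv.tag.data payload format are assumptions made for the example.

```javascript
const crypto = require("node:crypto");

// Assumption: hash the secret so any string yields exactly 32 key bytes.
function deriveKey(secret) {
  return crypto.createHash("sha256").update(secret, "utf8").digest();
}

function encrypt(plaintext, secret) {
  const iv = crypto.randomBytes(12); // 96-bit nonce, fresh for every message
  const cipher = crypto.createCipheriv("aes-256-gcm", deriveKey(secret), iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // authentication tag, checked on decrypt
  // Hypothetical storage format: base64 parts joined by dots.
  return [iv, tag, data].map((b) => b.toString("base64")).join(".");
}

function decrypt(payload, secret) {
  const [iv, tag, data] = payload.split(".").map((s) => Buffer.from(s, "base64"));
  const decipher = crypto.createDecipheriv("aes-256-gcm", deriveKey(secret), iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the payload was tampered with.
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}
```

Because GCM is authenticated, a changed PLATFORM_ENCRYPTION_SECRET makes previously stored keys fail to decrypt rather than decrypt to garbage.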

Note on frontend API keys: Vite embeds all VITE_* variables into the bundle at build time. They are sent directly to the AI provider from the browser — never proxied through the server — so use keys with the minimum required permissions.


Project Structure

codebase-explorer/
├── index.html                          ← Single HTML shell
├── vite.config.js                      ← Vite + React plugin config
├── package.json
├── .env.example                        ← Environment variable template
├── export.js                           ← Dev utility: dumps source to output.txt
├── scripts/
│   └── dev.js                          ← Concurrent dev launcher (checks ports before spawning)
├── Public/
│   └── codebase explorer-*.jpg         ← App screenshots
├── server/
│   ├── index.js                        ← HTTP server (no framework — raw node:http)
│   ├── data/
│   │   └── platform-db.json            ← JSON file-based database (users, analyses, indexes, etc.)
│   ├── lib/
│   │   ├── db.js                       ← readDb / writeDb / mutateDb helpers
│   │   ├── http.js                     ← readJson, sendJson, parseCookies, redirect, getRequestContext
│   │   ├── jsonStore.js                ← File-based JSON persistence with atomic writes
│   │   ├── rateLimit.js                ← In-memory sliding window rate limiter
│   │   ├── security.js                 ← AES-256-GCM encryption, prompt-injection sanitisation, input validation
│   │   └── session.js                  ← In-memory session store with signed cookies
│   └── services/
│       ├── github.js                   ← Server-side GitHub REST API calls (repo, tree, file content)
│       └── llm.js                      ← Server-side AI proxy (Anthropic + Groq, user key decryption)
└── src/
    ├── main.jsx                        ← React entry point
    ├── App.jsx                         ← ErrorBoundary wrapper
    ├── components/
    │   ├── CodebaseExplorer.jsx         ← 🏠 Root layout & state orchestrator
    │   ├── AIChat.jsx                   ← 💬 Multi-turn chat panel (file + codebase modes)
    │   ├── ArchGraph.jsx                ← 🗺  Radial folder architecture SVG
    │   ├── DependencyGraph.jsx          ← 🔗 Import/require dependency graph
    │   ├── SmellPanel.jsx               ← 🔍 Code smell detection panel
    │   ├── FileViewer.jsx               ← 💻 Line-numbered file content viewer
    │   ├── LangStats.jsx                ← 📊 Language distribution bar chart
    │   ├── WorkspacePanel.jsx           ← ⚙️  Settings, saved analyses, annotations, watches
    │   ├── TreeNode.jsx                 ← 📂 Collapsible sidebar directory node
    │   ├── FileRow.jsx                  ← 📄 Sidebar file row with badges and entry-point markers
    │   └── Badge.jsx                   ← 🏷  File-type pill component
    └── utils/
        ├── constants.js                ← Language map, binary/parseable ext sets, entry-point filenames
        ├── github.js                   ← GitHub REST API (repo metadata, tree, file content, batch fetch)
        ├── llm.js                      ← LLM router: Anthropic + Groq, retry, timeout, multi-turn
        ├── platform.js                 ← Backend API client (session, settings, analyses, annotations, watches)
        ├── treeBuilder.js              ← Flat GitHub tree → nested directory tree; language stats
        ├── dependencyParser.js         ← Import/require/from parser; path resolver; cycle detector
        ├── smellDetector.js            ← Heuristic smell rules (large files, nesting, duplicates, crowded root)
        ├── techStackDetector.js        ← Framework/language/tooling auto-detection
        └── codebaseIndex.js            ← In-memory per-file AI summary store for cross-file chat context

Architecture

Frontend

All UI state lives in one root component — CodebaseExplorer.jsx. Child components receive data via props and are kept stateless where possible. Side effects are isolated in utility modules:

| Module | Responsibility |
| --- | --- |
| github.js | All GitHub API interactions; backend proxy with direct fallback |
| llm.js | AI provider routing, retry logic, multi-turn history management |
| platform.js | Backend REST API client for auth, settings, analyses, indexes |
| dependencyParser.js | Regex-based import extraction, path resolution, DFS cycle detection |
| smellDetector.js | Heuristic analysis from file metadata only |
| techStackDetector.js | Two-pass stack detection (extensions + package.json) |
| codebaseIndex.js | In-memory per-file summary store for cross-file AI context |

Styling is 100% CSS-in-JS via inline styles — no external CSS library, no class name collisions, no build-time CSS processing. Colour theme is Catppuccin Mocha (dark). Font is JetBrains Mono via Google Fonts.

Backend

A zero-dependency Node.js HTTP server (node:http) with:

  • Session-based authentication — in-memory session store with HttpOnly cookies
  • GitHub OAuth 2.0 flow — CSRF-safe with one-time state tokens and TTL pruning
  • GitHub API proxy — passes the authenticated user's OAuth token for higher rate limits
  • AI proxy — decrypts user-stored keys before forwarding to Anthropic / Groq
  • JSON file database — atomic read-modify-write via server/data/platform-db.json
  • Rate limiting — 30 requests / 60 s per IP/user per route
  • Security headers — CSP, HSTS, X-Content-Type-Options, Referrer-Policy
  • AES-256-GCM encryption — for stored API keys using a configurable secret
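
An in-memory sliding-window limiter with the 30 requests per 60 s budget might be sketched as follows. The factory name and the key format are hypothetical; server/lib/rateLimit.js may be organised differently.

```javascript
function createRateLimiter({ limit = 30, windowMs = 60_000 } = {}) {
  const hits = new Map(); // key -> timestamps (ms) of recent requests

  // `key` would combine IP or user with the route, for example "1.2.3.4:/api/ai/chat".
  return function allow(key, now = Date.now()) {
    // Keep only the requests that still fall inside the window.
    const recent = (hits.get(key) || []).filter((t) => now - t < windowMs);
    if (recent.length >= limit) {
      hits.set(key, recent);
      return false;
    }
    recent.push(now);
    hits.set(key, recent);
    return true;
  };
}
```

Because the state lives in process memory, the counters reset whenever the server restarts.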

How the Dependency Graph Works

  1. All parseable files (.js, .jsx, .ts, .tsx, .mjs, .cjs, .py) are identified from the GitHub tree — up to 60, sorted by path length (shortest first).
  2. File contents are fetched in batches of 4 concurrent requests.
  3. Regex patterns extract raw import strings; only relative imports (starting with .) are kept.
  4. Resolved paths are matched against the known file tree (with extension and index-file fallback).
  5. A DFS traversal detects cycles; flagged paths are highlighted red in the SVG.
  6. Orphan files (never imported by anything) are styled distinctly.
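
Steps 3 to 6 can be illustrated with the sketch below. It handles JavaScript-style imports only, and the regexes, function names, and graph shape (a Map from file path to imported paths) are assumptions, not the contents of dependencyParser.js.

```javascript
const IMPORT_PATTERNS = [
  /\bimport\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/g, // import x from "./a", import "./a"
  /\bexport\s+[^'"]*?\s+from\s+['"]([^'"]+)['"]/g, // export { x } from "./a"
  /\brequire\(\s*['"]([^'"]+)['"]\s*\)/g, // require("./a")
];
const EXTENSIONS = [".js", ".jsx", ".ts", ".tsx", ".mjs", ".cjs"];

// Step 3: extract import strings and keep only the relative ones.
function extractRelativeImports(source) {
  const found = new Set();
  for (const pattern of IMPORT_PATTERNS) {
    for (const match of source.matchAll(pattern)) found.add(match[1]);
  }
  return [...found].filter((spec) => spec.startsWith("."));
}

// Step 4: resolve against the known file set, trying extensions and index files.
function resolveImport(fromPath, spec, knownFiles) {
  const parts = fromPath.split("/").slice(0, -1);
  for (const part of spec.split("/")) {
    if (part === "." || part === "") continue;
    if (part === "..") parts.pop();
    else parts.push(part);
  }
  const target = parts.join("/");
  const candidates = [
    target,
    ...EXTENSIONS.map((ext) => target + ext),
    ...EXTENSIONS.map((ext) => `${target}/index${ext}`),
  ];
  return candidates.find((c) => knownFiles.has(c)) ?? null;
}

// Step 5: depth-first search; meeting a node that is still on the stack means a cycle.
function findCycles(graph) {
  const state = new Map(); // 1 = on the current DFS stack, 2 = finished
  const stack = [];
  const cycles = [];
  const visit = (node) => {
    state.set(node, 1);
    stack.push(node);
    for (const next of graph.get(node) || []) {
      if (state.get(next) === 1) cycles.push(stack.slice(stack.indexOf(next)));
      else if (!state.has(next)) visit(next);
    }
    stack.pop();
    state.set(node, 2);
  };
  for (const node of graph.keys()) if (!state.has(node)) visit(node);
  return cycles;
}

// Step 6: orphans are files that no other file imports.
function findOrphans(graph) {
  const imported = new Set([...graph.values()].flat());
  return [...graph.keys()].filter((file) => !imported.has(file));
}
```

Regex extraction is fast but approximate: import-like text inside comments or strings would also be picked up by patterns like these.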

How AI Memory Works

  1. You open a file → the app sends it to the AI in the background: "Summarise this file in 2–3 sentences."
  2. The response is stored: store[repoKey][filePath] = summary.
  3. In Codebase mode, up to 30 summaries are prepended to the prompt as structured context.
  4. The index resets when a new repo is loaded.
  5. If the backend is running and you are signed in, the index is also persisted server-side.
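
The store described in steps 2 to 4 can be pictured as a nested object. In the sketch below only the store[repoKey][filePath] shape comes from the description above; the function names, the context format, and the choice to keep the first 30 summaries in insertion order are assumptions.

```javascript
const store = {}; // store[repoKey][filePath] = summary

function rememberSummary(repoKey, filePath, summary) {
  (store[repoKey] ||= {})[filePath] = summary;
}

// Build the context block that is prepended to a Codebase-mode prompt.
function buildCodebaseContext(repoKey, maxSummaries = 30) {
  const entries = Object.entries(store[repoKey] || {}).slice(0, maxSummaries);
  return entries.map(([path, summary]) => `### ${path}\n${summary}`).join("\n\n");
}

// Called when a new repository is loaded.
function resetIndex(repoKey) {
  delete store[repoKey];
}
```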

Backend Platform

When the backend is running (npm run dev:server), the following features are unlocked:

| Feature | Endpoint |
| --- | --- |
| GitHub OAuth login | GET /api/auth/github/start |
| Session check | GET /api/auth/session |
| Logout | POST /api/auth/logout |
| User settings | GET/PUT /api/user/settings |
| Encrypted AI keys | PUT /api/user/keys |
| Saved analyses | GET/POST /api/user/saved-analyses |
| Codebase index | GET/PUT /api/user/index |
| File annotations | GET/POST /api/user/annotations |
| Repo watch list | GET/POST/DELETE /api/user/watches |
| GitHub repo proxy | GET /api/github/repo |
| GitHub file proxy | GET /api/github/file |
| AI chat proxy | POST /api/ai/chat |
| Change-impact analysis | POST /api/analysis/impact |
| Code smell analysis | POST /api/analysis/smells |
| Health check | GET /api/health |

The database is a single JSON file at server/data/platform-db.json. It stores users, analyses, indexes, annotations, watches, and audit logs (capped at 5,000 entries).
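
A common way to get atomic writes for a JSON file database is to write to a temporary file and rename it over the target. The sketch below assumes that approach; the mutateDb name appears in the project structure, but its signature here, the auditLogs field name, and the helper functions are hypothetical.

```javascript
const fs = require("node:fs");

// Write to a sibling temp file, then rename it over the target. On the same
// filesystem the rename replaces the file in a single step, so readers never
// see a half-written database.
function writeJsonAtomic(filePath, data) {
  const tmp = `${filePath}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(data, null, 2));
  fs.renameSync(tmp, filePath);
}

// Read-modify-write helper (hypothetical signature).
function mutateDb(filePath, mutate) {
  const db = fs.existsSync(filePath) ? JSON.parse(fs.readFileSync(filePath, "utf8")) : {};
  mutate(db);
  writeJsonAtomic(filePath, db);
  return db;
}

// Keep only the newest `cap` audit entries.
function appendAudit(db, entry, cap = 5000) {
  db.auditLogs = [...(db.auditLogs || []), entry].slice(-cap);
}
```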


Tech Stack

| Layer | Technology |
| --- | --- |
| UI framework | React 18 |
| Build tool | Vite 7 |
| Styling | CSS-in-JS (inline styles), no external CSS library |
| Colour theme | Catppuccin Mocha (dark) |
| Font | JetBrains Mono via Google Fonts |
| Data source | GitHub REST API v3 |
| AI providers | Anthropic Claude API · Groq API |
| Graphs | Hand-written SVG, no graph library |
| Backend | Node.js 20+ (node:http, zero npm dependencies) |
| Database | JSON file store (atomic read-modify-write) |
| Encryption | AES-256-GCM via node:crypto |

Build for Production

npm run build     # Compiles and bundles into dist/
npm run preview   # Serves the production build locally for verification

The output is a fully static site — host it on any static file server, CDN, or service like Vercel, Netlify, or GitHub Pages.

Security reminder: Vite embeds all VITE_* variables into the bundle at build time. Do not use API keys with broad permissions in a publicly deployed build.
