
CodeBoarding

Supports TypeScript Projects | Supports Python 3 Projects | License: MIT | Site: CodeBoarding.org | Join us on Discord

CodeBoarding is an open-source codebase analysis tool that uses static analysis and LLM agents to generate high-level diagram representations of a codebase that both humans and agents can interact with.
It's designed to support onboarding, documentation, and comprehension for large, complex systems.

  • Extracts modules and their relationships from the project's control flow graph.
  • Builds multiple levels of abstraction with an LLM agent (multi-provider support) using remote or local inference.
  • Outputs interactive diagrams (Mermaid.js) for integration into docs, IDEs, and CI/CD.

📄 Existing visual generations: GeneratedOnBoardings
🌐 Try it on your open-source project: www.codeboarding.org/demo

🧩 How it works

For detailed architecture information, see our diagram documentation.

graph LR
    API_Service["API Service"]
    Job_Database["Job Database"]
    Orchestration_Engine["Orchestration Engine"]
    Repository_Manager["Repository Manager"]
    Static_Analysis_Engine["Static Analysis Engine"]
    AI_Interpretation_Layer["AI Interpretation Layer"]
    Output_Generation_Engine["Output Generation Engine"]
    API_Service -- "Initiates Job" --> Job_Database
    API_Service -- "Triggers Analysis" --> Orchestration_Engine
    Orchestration_Engine -- "Manages Job State" --> Job_Database
    Orchestration_Engine -- "Requests Code" --> Repository_Manager
    Repository_Manager -- "Provides Code" --> Orchestration_Engine
    Orchestration_Engine -- "Requests Static Analysis" --> Static_Analysis_Engine
    Static_Analysis_Engine -- "Provides Analysis Results" --> Orchestration_Engine
    Orchestration_Engine -- "Feeds Data" --> AI_Interpretation_Layer
    AI_Interpretation_Layer -- "Returns Insights" --> Orchestration_Engine
    AI_Interpretation_Layer -- "Queries Diff" --> Repository_Manager
    Orchestration_Engine -- "Passes Final Insights" --> Output_Generation_Engine
    Output_Generation_Engine -- "Delivers Documentation" --> API_Service
    click API_Service href "https://github.com/CodeBoarding/CodeBoarding/blob/main/.codeboarding/API_Service.md" "Details"
    click Job_Database href "https://github.com/CodeBoarding/CodeBoarding/blob/main/.codeboarding/Job_Database.md" "Details"
    click Repository_Manager href "https://github.com/CodeBoarding/CodeBoarding/blob/main/.codeboarding/Repository_Manager.md" "Details"
    click Static_Analysis_Engine href "https://github.com/CodeBoarding/CodeBoarding/blob/main/.codeboarding/Static_Analysis_Engine.md" "Details"
    click AI_Interpretation_Layer href "https://github.com/CodeBoarding/CodeBoarding/blob/main/.codeboarding/AI_Interpretation_Layer.md" "Details"
    click Output_Generation_Engine href "https://github.com/CodeBoarding/CodeBoarding/blob/main/.codeboarding/Output_Generation_Engine.md" "Details"

📌 Setup

Set up the environment:

uv venv --python 3.11
uv pip sync requirements.txt
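
If you prefer not to use uv, a plain virtual environment works as well. A minimal equivalent sketch using the standard venv module and pip:

python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt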

Language Scanning:

macOS:

brew install cmake pkg-config icu4c
gem install github-linguist
  • Install Xcode from the App Store to compile the C++ dependencies.

Linux:

sudo apt-get install build-essential cmake pkg-config libicu-dev zlib1g-dev libcurl4-openssl-dev libssl-dev ruby-dev
gem install github-linguist
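
To sanity-check the linguist install, you can run it against any local git repository. The --breakdown flag is part of the github-linguist CLI and prints the detected language composition:

cd /path/to/any/git/repo
github-linguist --breakdown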

Install the language servers for the technologies you need:

pip install pyright # Python
npm install --save typescript-language-server typescript # TypeScript
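
To confirm the servers are available before wiring them into the configuration (typescript-language-server is installed locally by the npm command above, so it is invoked through npx here):

pyright --version                         # Python language server
npx typescript-language-server --version  # TypeScript language server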

Important

After installing the dependencies and servers you want to use, update the configuration to point to them. For github-linguist, for example, this means pointing to its executable. The configuration lives in the static_analysis_config.yml file.
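
To find the executable paths to reference in static_analysis_config.yml, the standard which command is usually enough (shown for github-linguist and pyright; the printed paths will differ per machine):

which github-linguist   # path to use for linguist in the config
which pyright           # path to the Python language server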

Environment Variables

# LLM Provider (choose one)
OPENAI_API_KEY=                 
ANTHROPIC_API_KEY=                 
GOOGLE_API_KEY=                  
AWS_BEARER_TOKEN_BEDROCK=
OLLAMA_BASE_URL=
OPENAI_BASE_URL=                   # Optional: Custom OpenAI endpoint     

# Core Configuration
CACHING_DOCUMENTATION=false        # Enable/disable documentation caching
REPO_ROOT=./repos                  # Directory for downloaded repositories
ROOT_RESULT=./results              # Directory for generated outputs
PROJECT_ROOT=/path/to/CodeBoarding # Source project root (must end with /CodeBoarding)
DIAGRAM_DEPTH_LEVEL=1              # Max depth level for diagram generation
STATIC_ANALYSIS_CONFIG=./static_analysis_config.yml # Path to static analysis config

# Optional
GITHUB_TOKEN=                     # For accessing private repositories
LANGSMITH_TRACING=false           # Optional: Enable LangSmith tracing
LANGSMITH_ENDPOINT=               # Optional: LangSmith endpoint
LANGSMITH_PROJECT=                # Optional: LangSmith project name
LANGCHAIN_API_KEY=                # Optional: LangChain API key

💡 Tip: In our experience, Google Gemini 2.5 Pro yields the best results for complex diagram generation tasks.

Run it

python demo.py <github_repo_url> --output-dir <output_path>
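
For example, to analyze this repository itself with Gemini as the provider (a sketch; any provider from the environment variables above works, and the output directory is arbitrary):

export GOOGLE_API_KEY=<your_key>
python demo.py https://github.com/CodeBoarding/CodeBoarding --output-dir ./results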

๐Ÿ–ฅ๏ธ Examples:

We have visualized more than 800 popular open-source projects. A few examples:

PyTorch:

graph LR
    Tensor_Operations_Core["Tensor Operations & Core"]
    Automatic_Differentiation_Autograd_Engine_["Automatic Differentiation (Autograd Engine)"]
    Neural_Network_Modules_torch_nn_["Neural Network Modules (torch.nn)"]
    Optimizers_torch_optim_["Optimizers (torch.optim)"]
    Data_Utilities_torch_utils_data_["Data Utilities (torch.utils.data)"]
    JIT_Compiler_Scripting_TorchScript_["JIT Compiler & Scripting (TorchScript)"]
    Hardware_Backends["Hardware Backends"]
    Data_Utilities_torch_utils_data_ -- "provides data to" --> Tensor_Operations_Core
    Tensor_Operations_Core -- "provides primitives for" --> Neural_Network_Modules_torch_nn_
    Tensor_Operations_Core -- "leverages" --> Hardware_Backends
    Neural_Network_Modules_torch_nn_ -- "performs operations on" --> Tensor_Operations_Core
    Neural_Network_Modules_torch_nn_ -- "operations recorded by" --> Automatic_Differentiation_Autograd_Engine_
    Neural_Network_Modules_torch_nn_ -- "exported to" --> JIT_Compiler_Scripting_TorchScript_
    Automatic_Differentiation_Autograd_Engine_ -- "computes gradients for" --> Optimizers_torch_optim_
    Optimizers_torch_optim_ -- "updates parameters of" --> Neural_Network_Modules_torch_nn_
    Hardware_Backends -- "executes computations for" --> Tensor_Operations_Core
    click Tensor_Operations_Core href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pytorch/Tensor_Operations_Core.md" "Details"
    click Automatic_Differentiation_Autograd_Engine_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pytorch/Automatic_Differentiation_Autograd_Engine_.md" "Details"
    click Neural_Network_Modules_torch_nn_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pytorch/Neural_Network_Modules_torch_nn_.md" "Details"
    click Optimizers_torch_optim_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pytorch/Optimizers_torch_optim_.md" "Details"
    click Data_Utilities_torch_utils_data_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pytorch/Data_Utilities_torch_utils_data_.md" "Details"
    click JIT_Compiler_Scripting_TorchScript_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pytorch/JIT_Compiler_Scripting_TorchScript_.md" "Details"
    click Hardware_Backends href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pytorch/Hardware_Backends.md" "Details"

FastAPI:

graph LR
    Application_Core["Application Core"]
    Middleware["Middleware"]
    Routing["Routing"]
    Request_Handling_Validation["Request Handling & Validation"]
    Dependency_Injection["Dependency Injection"]
    Security["Security"]
    Response_Handling["Response Handling"]
    API_Documentation["API Documentation"]
    Application_Core -- " sends request to " --> Middleware
    Middleware -- " forwards request to " --> Routing
    Routing -- " uses " --> Request_Handling_Validation
    Routing -- " uses " --> Dependency_Injection
    Routing -- " provides data for " --> Response_Handling
    Dependency_Injection -- " enables " --> Security
    Response_Handling -- " sends response to " --> Middleware
    API_Documentation -- " inspects " --> Routing
    API_Documentation -- " inspects " --> Request_Handling_Validation
    click Application_Core href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/fastapi/Application_Core.md" "Details"
    click Middleware href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/fastapi/Middleware.md" "Details"
    click Routing href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/fastapi/Routing.md" "Details"
    click Request_Handling_Validation href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/fastapi/Request_Handling_Validation.md" "Details"
    click Dependency_Injection href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/fastapi/Dependency_Injection.md" "Details"
    click Security href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/fastapi/Security.md" "Details"
    click API_Documentation href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/fastapi/API_Documentation.md" "Details"

ChatTTS:

graph LR
    ChatTTS_Core_Orchestrator["ChatTTS Core Orchestrator"]
    Text_Processing_Module["Text Processing Module"]
    Speech_Synthesis_Models["Speech Synthesis Models"]
    Velocity_Inference_Engine["Velocity Inference Engine"]
    System_Utilities_Configuration["System Utilities & Configuration"]
    ChatTTS_Core_Orchestrator -- " Orchestrates Text Flow " --> Text_Processing_Module
    ChatTTS_Core_Orchestrator -- " Receives Processed Text " --> Text_Processing_Module
    ChatTTS_Core_Orchestrator -- " Orchestrates Synthesis Flow " --> Speech_Synthesis_Models
    ChatTTS_Core_Orchestrator -- " Receives Audio Output " --> Speech_Synthesis_Models
    ChatTTS_Core_Orchestrator -- " Initializes & Configures " --> System_Utilities_Configuration
    ChatTTS_Core_Orchestrator -- " Loads Assets " --> System_Utilities_Configuration
    Text_Processing_Module -- " Receives Raw Text " --> ChatTTS_Core_Orchestrator
    Text_Processing_Module -- " Provides Processed Text " --> ChatTTS_Core_Orchestrator
    Speech_Synthesis_Models -- " Receives Processed Data " --> ChatTTS_Core_Orchestrator
    Speech_Synthesis_Models -- " Generates Audio Output " --> ChatTTS_Core_Orchestrator
    Speech_Synthesis_Models -- " Delegates Inference To " --> Velocity_Inference_Engine
    Speech_Synthesis_Models -- " Receives Inference Results " --> Velocity_Inference_Engine
    Speech_Synthesis_Models -- " Utilizes GPU Resources " --> System_Utilities_Configuration
    Speech_Synthesis_Models -- " Accesses Model Config " --> System_Utilities_Configuration
    Velocity_Inference_Engine -- " Executes Model Inference " --> Speech_Synthesis_Models
    Velocity_Inference_Engine -- " Returns Inference Output " --> Speech_Synthesis_Models
    Velocity_Inference_Engine -- " Receives Engine Configuration " --> System_Utilities_Configuration
    System_Utilities_Configuration -- " Provides Assets & Config " --> ChatTTS_Core_Orchestrator
    System_Utilities_Configuration -- " Provides GPU & Config " --> Speech_Synthesis_Models
    System_Utilities_Configuration -- " Provides Engine Config " --> Velocity_Inference_Engine
    click ChatTTS_Core_Orchestrator href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//ChatTTS/ChatTTS_Core_Orchestrator.md" "Details"
    click Text_Processing_Module href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//ChatTTS/Text_Processing_Module.md" "Details"
    click Speech_Synthesis_Models href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//ChatTTS/Speech_Synthesis_Models.md" "Details"
    click Velocity_Inference_Engine href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//ChatTTS/Velocity_Inference_Engine.md" "Details"
    click System_Utilities_Configuration href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//ChatTTS/System_Utilities_Configuration.md" "Details"

Browse more examples: GeneratedOnBoardings Repository

🚀 Integrations

CodeBoarding is integrated with the tools we use every day:

  • 📦 VS Code Extension: Interact with the diagrams directly in your IDE.
  • ⚙️ GitHub Action: Automate diagram generation in CI/CD.
  • 🔗 MCP Server: Serves the concise documentation to your AI agent assistant (Claude Code, VS Code, Cursor, etc.).

๐Ÿค Contributing

We're just getting started and would love your help! If you have ideas, spot bugs, or want to improve anything, please open an issue or tackle an existing one. We actively track suggestions and welcome pull requests of all sizes.

🔮 Vision

A unified high-level representation of codebases that is accurate (hence the static analysis). This representation is used by both people and agents, and is fully integrated into IDEs, MCP servers, and development workflows.