barbaric7/repomind

RepoMind

Transform any GitHub repository into a structured knowledge graph.

RepoMind is a multi-language code intelligence engine that analyzes source code using compiler techniques, builds semantic relationships between code entities, and provides AI-powered insights through GraphRAG.

Unlike traditional AI code assistants that rely primarily on large language models, RepoMind extracts structured information directly from source code using Abstract Syntax Trees (ASTs), static analysis, and graph-based representations. The LLM is used only as a reasoning layer on top of verified program knowledge.


Vision

Many modern repositories are too large to understand by reading files one at a time.

RepoMind aims to make any codebase explorable by converting source code into structured knowledge.

Given a repository, RepoMind will:

  • Understand project architecture
  • Build dependency and call graphs
  • Detect dead code and technical debt
  • Generate onboarding documentation
  • Explain execution flow
  • Build a semantic knowledge graph
  • Enable natural language querying through GraphRAG

The long-term goal is to provide compiler-level understanding combined with AI-assisted reasoning.


Core Principles

  • Structure before intelligence — Every feature produces structured data before involving an LLM.
  • Language-agnostic architecture — Support multiple programming languages through a unified parsing pipeline.
  • Knowledge-first design — Source code is transformed into entities, relationships, and graphs rather than treated as plain text.
  • Extensible by design — New languages, analyzers, and graph backends can be added without changing the core architecture.

Architecture

Repository
      │
      ▼
Repository Scanner
      │
      ▼
Language Detection
      │
      ▼
Parser Factory
      │
      ▼
Tree-sitter Parser
      │
      ▼
AST Generation
      │
      ▼
Entity Extraction
      │
      ▼
Semantic Analysis
      │
      ▼
Knowledge Graph
      │
      ▼
Embeddings
      │
      ▼
GraphRAG
      │
      ▼
LLM Reasoning
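
The early stages of this pipeline (Language Detection and Parser Factory) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the extension table, the `register_parser` decorator, and the `ParseResult` placeholder are all hypothetical names, and the "parser" here only counts lines where a real one would return a Tree-sitter AST.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable, Dict, Optional

# Hypothetical extension table; a real detector may use more signals.
EXTENSION_LANGUAGES = {".py": "python", ".ts": "typescript", ".go": "go"}


def detect_language(path: str) -> Optional[str]:
    """Language Detection stage: map a file path to a language name."""
    return EXTENSION_LANGUAGES.get(PurePosixPath(path).suffix.lower())


@dataclass
class ParseResult:
    language: str
    path: str
    line_count: int  # stand-in for a real AST


Parser = Callable[[str, str], ParseResult]
_PARSERS: Dict[str, Parser] = {}


def register_parser(language: str) -> Callable[[Parser], Parser]:
    """Parser Factory stage: languages register themselves, the core stays unchanged."""
    def decorator(fn: Parser) -> Parser:
        _PARSERS[language] = fn
        return fn
    return decorator


@register_parser("python")
def parse_python(path: str, source: str) -> ParseResult:
    # Placeholder parser: counts lines instead of building a syntax tree.
    return ParseResult("python", path, len(source.splitlines()))


def parse_file(path: str, source: str) -> Optional[ParseResult]:
    """Run detection, then dispatch to the registered parser if there is one."""
    language = detect_language(path)
    parser = _PARSERS.get(language) if language else None
    return parser(path, source) if parser else None
```

With a registry of this shape, supporting another language means adding one decorated function; files whose language has no registered parser are skipped rather than failing the run.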

Planned Features

Repository Analysis

  • Repository cloning
  • Recursive file indexing
  • Multi-language support
  • Incremental repository analysis
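
One way recursive indexing and incremental analysis could fit together is content hashing: index every file once, then re-analyze only the paths whose hash changed. The sketch below assumes this approach; `index_files`, `diff_indexes`, and the `SKIP_DIRS` set are hypothetical names, not part of RepoMind.

```python
import hashlib
from pathlib import Path
from typing import Dict, List

# Assumed set of directories that are not worth indexing.
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def index_files(root: Path) -> Dict[str, str]:
    """Recursively map each file's relative path to the SHA-256 of its contents."""
    index: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if path.is_file() and not SKIP_DIRS.intersection(relative.parts):
            index[relative.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return index


def diff_indexes(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two indexes and report which paths need (re)analysis."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(p for p in new.keys() & old.keys() if new[p] != old[p]),
    }
```

Only the `added` and `changed` paths would need to be parsed again; entities belonging to `removed` paths would be dropped from the graph.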

Parsing & Semantic Analysis

  • AST generation using Tree-sitter
  • Function extraction
  • Class extraction
  • Import analysis
  • Variable extraction
  • Call graph generation
  • Type relationships
  • Symbol references
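
To give a feel for what function, class, import, and call extraction produce, here is a small sketch that uses Python's standard-library `ast` module as a stand-in for Tree-sitter. It handles Python source only and resolves only direct calls by name; `ModuleEntities` and `extract_entities` are hypothetical names chosen for the example.

```python
import ast
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ModuleEntities:
    functions: List[str] = field(default_factory=list)
    classes: List[str] = field(default_factory=list)
    imports: List[str] = field(default_factory=list)
    calls: List[Tuple[str, str]] = field(default_factory=list)  # (caller, callee)


class EntityVisitor(ast.NodeVisitor):
    """Walk the syntax tree once and record the entities it defines."""

    def __init__(self) -> None:
        self.entities = ModuleEntities()
        self._scope = ["<module>"]  # stack of enclosing function names

    def visit_Import(self, node: ast.Import) -> None:
        self.entities.imports.extend(alias.name for alias in node.names)

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        # Relative imports such as "from . import x" have no module name.
        self.entities.imports.append(node.module or ".")

    def visit_ClassDef(self, node: ast.ClassDef) -> None:
        self.entities.classes.append(node.name)
        self.generic_visit(node)

    def visit_FunctionDef(self, node) -> None:
        self.entities.functions.append(node.name)
        self._scope.append(node.name)
        self.generic_visit(node)
        self._scope.pop()

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Call(self, node: ast.Call) -> None:
        # Only plain-name calls are recorded; attribute calls are ignored here.
        if isinstance(node.func, ast.Name):
            self.entities.calls.append((self._scope[-1], node.func.id))
        self.generic_visit(node)


def extract_entities(source: str) -> ModuleEntities:
    visitor = EntityVisitor()
    visitor.visit(ast.parse(source))
    return visitor.entities
```

The `(caller, callee)` pairs are the raw material for a call graph; a multi-language version would obtain the same records from Tree-sitter queries instead of `ast`.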

Static Analysis

  • Dead code detection
  • Circular dependency detection
  • Unused imports
  • Unused variables
  • Complexity analysis
  • Technical debt estimation
  • Maintainability metrics
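
As an illustration of one analysis from this list, circular dependency detection can be expressed as a depth-first search over a module dependency graph. The sketch below is one standard way to do it, not necessarily how RepoMind will; the `find_cycle` name and the adjacency-dict input format are assumptions.

```python
from typing import Dict, List, Optional


def find_cycle(dependencies: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a path whose first node repeats at the end, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        stack.append(node)
        for dep in dependencies.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the path loops back to it.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in dependencies:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

For example, `find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]})` returns `["a", "b", "c", "a"]`. The recursion depth grows with the longest dependency chain, so a very deep graph would call for an iterative version.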

Knowledge Graph

  • Entity extraction
  • Relationship extraction
  • Neo4j integration
  • Dependency graph
  • Call graph
  • Architecture graph

AI Layer

  • GraphRAG
  • Repository Q&A
  • Architecture explanation
  • Documentation generation
  • Onboarding guides
  • Bug explanation
  • Code summarization
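
The retrieval half of GraphRAG can be sketched as: find the graph entities a question mentions, expand a few hops around them, and hand the resulting facts to the LLM as context. The example below is a deliberately simplified version that matches entities by exact name, where a real pipeline would more likely use embeddings. The edge-list format, `retrieve_context`, and `build_prompt` are hypothetical, and no LLM is called.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical edge format: (source, relationship, target)
Edge = Tuple[str, str, str]


def retrieve_context(edges: List[Edge], question: str, max_hops: int = 1) -> List[str]:
    """Collect facts within max_hops of any entity named in the question."""
    words = {w.strip("?.,!()").lower() for w in question.split()}
    nodes = {n for s, _, t in edges for n in (s, t)}
    seeds = sorted(n for n in nodes if n.lower() in words)

    depth: Dict[str, int] = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    facts: List[str] = []
    while queue:
        node = queue.popleft()
        if depth[node] >= max_hops:
            continue
        for s, rel, t in edges:
            if node not in (s, t):
                continue
            fact = f"{s} {rel} {t}"
            if fact not in facts:
                facts.append(fact)
            other = t if s == node else s
            if other not in depth:
                depth[other] = depth[node] + 1
                queue.append(other)
    return facts


def build_prompt(facts: List[str], question: str) -> str:
    """Assemble the text a reasoning model would receive: facts first, then the question."""
    context = "\n".join(f"- {fact}" for fact in facts) or "- (no matching entities)"
    return f"Known facts about the repository:\n{context}\n\nQuestion: {question}"
```

Because the prompt is built only from extracted facts, the model's answer can be traced back to specific relationships in the graph.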

Technology Stack

Frontend

  • Next.js
  • React
  • TypeScript
  • Tailwind CSS

Backend

  • FastAPI
  • Python

Parsing

  • Tree-sitter
  • Tree-sitter Queries

Storage

  • PostgreSQL
  • Neo4j
  • Qdrant (planned)

AI

  • GraphRAG
  • OpenAI / Local LLMs
  • Embedding Models

Infrastructure

  • Docker
  • Kubernetes
  • GitHub Actions

Repository Structure

repomind/
├── frontend/
├── backend/
│   ├── app/
│   │   ├── ast/
│   │   ├── builders/
│   │   ├── entities/
│   │   ├── models/
│   │   ├── parsers/
│   │   ├── routers/
│   │   ├── services/
│   │   └── utils/
│   └── repos/
├── docs/
├── docker/
└── scripts/

Development Philosophy

RepoMind follows a layered architecture in which each layer has a single responsibility:

  • Parsers understand syntax.
  • Builders convert syntax into entities.
  • Analyzers discover relationships.
  • Graph engines organize knowledge.
  • LLMs reason over structured information.

This separation makes the platform easier to extend, test, and maintain while keeping AI grounded in verified program structure.


Roadmap

  • Repository cloning
  • Recursive file tree generation
  • Language detection
  • Tree-sitter integration
  • Multi-language parser framework
  • Entity extraction
  • Semantic analysis
  • Knowledge graph generation
  • Static analysis engine
  • Embedding pipeline
  • GraphRAG
  • AI-powered repository assistant

Contributing

RepoMind is currently under active development.

Contributions, discussions, and suggestions are welcome as the architecture evolves.


License

This project is licensed under the MIT License.
