Repolens

Evaluate GitHub repositories like a senior engineer — with deterministic, explainable scoring.

Features · Architecture · Getting Started · API Reference · Scoring Methodology · Tech Stack · Contributing

Overview

Repolens (previously GitHub Project Analyzer) is a production-grade web application that evaluates GitHub repositories using recruiter-inspired engineering metrics. It produces a deterministic, explainable 0–100 score with a hiring confidence level, risk flags, and optionally AI-enhanced improvement suggestions via Google Gemini.

It simulates how a technical recruiter or senior engineer evaluates a candidate's GitHub project — not by looking at stars or commit count alone, but by analyzing:

README quality — Is the project well-documented? Are there installation instructions, usage examples, architecture sections?
Commit discipline — Are commits spread over time or crammed in one day? Are messages meaningful?
Tech stack maturity — Does the project use testing, linting, TypeScript, CI/CD?
Architectural complexity — Is the codebase well-structured with separation of concerns?

What this is NOT

A GitHub stats viewer (stars, forks, contributors)
A simple API wrapper around GitHub's REST API
An AI-generated score — AI is used only for enhancing suggestion text, never for scoring

Features

Deterministic scoring — Same repository always produces the same score. No randomness, no AI in the scoring pipeline.
Explainability — Every point earned or lost traces back to a specific metric with a specific threshold.
Actionable output — Reports don't just say "your README is bad" — they specify exactly what sections are missing and what to add.
4 independent analyzers — README, Commits, Tech Stack, and Complexity — each producing structured findings, strengths, and suggestions.
Hiring confidence levels — Low (0–40), Moderate (41–70), Strong (71–100) with detailed reasoning.
Risk flag detection — Critical, warning, and info-level flags surfaced from analyzer metadata.
AI-enhanced suggestions (optional) — Google Gemini rewrites raw suggestions into professional, prioritized improvement roadmaps.
Full-stack — API routes + interactive dashboard with charts, animations, and JSON export.
Rate-limit aware — GitHub API client with exponential backoff, retry logic, and rate limit tracking.
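
The retry behaviour in the last feature can be sketched roughly as follows. This is a minimal illustration, not the project's `client.ts`: the names `backoffDelayMs` and `withRetry`, the 500 ms base delay, the 8 s cap, and the default of three attempts are all assumptions.

```typescript
// Hypothetical helper: delay before retry number `attempt` (0-based).
// The delay doubles on each attempt and is capped at `maxMs`.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical wrapper: calls `fn` up to `maxAttempts` times in total.
// `sleep` is injected so callers (and tests) decide how waiting happens.
async function withRetry<T>(
  fn: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait only if another attempt will follow.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

A real client would probably retry only on transient failures (timeouts, 5xx, rate-limit responses) rather than on every error, as this sketch does.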

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                        Frontend (Next.js App Router)                    │
│                                                                         │
│  page.tsx ───────────────────────────────────────────────────────────── │
│  │  SearchBar → POST /api/analyze → Display Results                     │
│  │                                                                      │
│  │  ScoreOverview   RadarChart   CommitTimeline                         │
│  │  CategoryBreakdown   TechStackBadges   ImprovementRoadmap            │
│  │  ExportButton                                                        │
└─────────────┬───────────────────────────────────────────────────────────┘
              │ HTTP POST /api/analyze
              ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                     API Layer (app/api/analyze/route.ts)                │
│                                                                         │
│  1. Parse & validate request body (Zod)                                 │
│  2. Call orchestrator                                                   │
│  3. Return AnalysisReport or structured error                           │
└─────────────┬───────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                     Orchestrator (lib/services/orchestrator.ts)         │
│                                                                         │
│  Stage 1: Resolve repo identifier (URL or owner/repo)                   │
│  Stage 2: Fetch all repo data from GitHub API (parallel)                │
│  Stage 3: Run all 4 analyzers (synchronous, pure functions)             │
│  Stage 4: Compute weighted score + hiring confidence + risk flags       │
│  Stage 5: Optionally enhance with Gemini AI                             │
│  Stage 6: Assemble and return AnalysisReport                            │
└──────┬──────────────┬───────────────┬───────────────┬───────────────────┘
       │              │               │               │
       ▼              ▼               ▼               ▼
┌────────────┐ ┌────────────┐ ┌────────────┐ ┌─────────────────┐
│   GitHub   │ │  Analyzer  │ │  Scoring   │ │   AI Layer      │
│  Service   │ │  Engine    │ │  Engine    │ │   (Optional)    │
│            │ │            │ │            │ │                 │
│ client.ts  │ │ readme     │ │ engine.ts  │ │ geminiClient.ts │
│ repoSvc.ts │ │ commits    │ │ normalizer │ │ enhancer.ts     │
│ types.ts   │ │ techStack  │ │            │ │                 │
│            │ │ complexity │ │ Weights:   │ │ Never affects   │
│ Axios +    │ │            │ │ R:25%      │ │ scoring.        │
│ retry +    │ │ All pure   │ │ C:20%      │ │ Only rewrites   │
│ rate limit │ │ functions  │ │ T:30%      │ │ suggestions.    │
│            │ │            │ │ X:25%      │ │                 │
└────────────┘ └────────────┘ └────────────┘ └─────────────────┘

Data Flow

User enters GitHub URL
  → POST /api/analyze { repoUrl: "https://github.com/owner/repo" }
  → Zod validates input (repoUrl or owner+repo)
  → Orchestrator resolves repo identifier
  → 6 parallel GitHub API fetches (metadata, commits, tree, README, languages, package.json)
  → 4 analyzers run on fetched data (pure functions, no side effects)
  → Scoring engine normalizes + weights results → deterministic 0–100 score
  → (Optional) Gemini AI enhances suggestions into a professional roadmap
  → Final AnalysisReport returned as JSON
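
The six parallel fetches in the flow above could look something like this. It is a sketch under assumptions: the `RepoData` shape, the `Fetchers` type, and `fetchRepoData` are hypothetical names, and each fetcher is passed in rather than calling the GitHub API directly.

```typescript
// Assumed container for the six pieces of data named in the data flow.
interface RepoData {
  metadata: unknown;
  commits: unknown;
  tree: unknown;
  readme: unknown;
  languages: unknown;
  packageJson: unknown;
}

type Fetcher = (owner: string, repo: string) => Promise<unknown>;
type Fetchers = Record<keyof RepoData, Fetcher>;

// Starts all six requests at once and waits for every one to finish.
async function fetchRepoData(
  owner: string,
  repo: string,
  fetchers: Fetchers,
): Promise<RepoData> {
  const [metadata, commits, tree, readme, languages, packageJson] =
    await Promise.all([
      fetchers.metadata(owner, repo),
      fetchers.commits(owner, repo),
      fetchers.tree(owner, repo),
      fetchers.readme(owner, repo),
      fetchers.languages(owner, repo),
      fetchers.packageJson(owner, repo),
    ]);
  return { metadata, commits, tree, readme, languages, packageJson };
}
```

`Promise.all` rejects as soon as any fetch fails. Whether the real service tolerates a missing README or `package.json` is not stated here; `Promise.allSettled` would be one way to allow that.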

Getting Started

Prerequisites

Node.js >= 18.x
npm >= 9.x (or pnpm / yarn)
A GitHub Personal Access Token with public_repo scope (or repo for private repos)
(Optional) A Google Gemini API key for AI-enhanced suggestions

Installation

# 1. Clone the repository
git clone https://github.com/divyanshu12-fullstack/github-analyzer.git
cd github-analyzer

# 2. Install dependencies
npm install

# 3. Set up environment variables
cp .env.example .env.local

Environment Variables

Edit .env.local with your keys:

# Required — GitHub Personal Access Token
GITHUB_TOKEN=ghp_your_token_here

# Optional — Google Gemini API key (enables AI suggestions)
GEMINI_API_KEY=your_gemini_key_here

# Optional — Override log level (trace | debug | info | warn | error | fatal)
LOG_LEVEL=debug

Variable	Required	Description
`GITHUB_TOKEN`	Yes	GitHub PAT with `public_repo` scope. Rate limit: 5,000 req/hr with token, 60/hr without.
`GEMINI_API_KEY`	No	Enables AI suggestion enhancement. App works fully without it (`aiSuggestions: null`).
`LOG_LEVEL`	No	Defaults to `debug` in dev, `info` in production.
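
One way to read these variables at startup is sketched below. `loadConfig` and `AppConfig` are hypothetical names, and using `NODE_ENV` to tell dev from production is an assumption; the project may structure this differently.

```typescript
type LogLevel = "trace" | "debug" | "info" | "warn" | "error" | "fatal";
const LOG_LEVELS: LogLevel[] = ["trace", "debug", "info", "warn", "error", "fatal"];

interface AppConfig {
  githubToken: string;
  geminiApiKey: string | null; // null means AI enhancement is disabled
  logLevel: LogLevel;
}

// Hypothetical loader: takes the environment as a plain record so it can be
// exercised without touching the real process environment.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const githubToken = env["GITHUB_TOKEN"];
  if (!githubToken) {
    throw new Error("GITHUB_TOKEN is required");
  }
  // Default from the table above: debug in dev, info in production.
  const fallback: LogLevel = env["NODE_ENV"] === "production" ? "info" : "debug";
  const requested = env["LOG_LEVEL"] as LogLevel | undefined;
  const logLevel =
    requested !== undefined && LOG_LEVELS.includes(requested) ? requested : fallback;
  return {
    githubToken,
    geminiApiKey: env["GEMINI_API_KEY"] || null,
    logLevel,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`.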

Running Locally

# Development server (with hot reload)
npm run dev

# Production build
npm run build
npm start

Open http://localhost:3000 to use the dashboard.

Available Scripts

Script	Command	Description
`npm run dev`	`next dev`	Start development server with hot reload
`npm run build`	`next build`	Create optimized production build
`npm start`	`next start`	Start production server
`npm run lint`	`eslint .`	Run ESLint
`npm run lint:fix`	`eslint . --fix`	Auto-fix lint issues
`npm run format`	`prettier --write .`	Format all files with Prettier
`npm run format:check`	`prettier --check .`	Check formatting without writing
`npm test`	`jest`	Run test suite
`npm run test:watch`	`jest --watch`	Run tests in watch mode
`npm run test:coverage`	`jest --coverage`	Run tests with coverage report

API Reference

`POST /api/analyze`

Analyze a GitHub repository and return a full scoring report.

Request Body:

{ "repoUrl": "https://github.com/vercel/next.js" }

Or alternatively:

{ "owner": "vercel", "repo": "next.js" }
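
Both request shapes resolve to the same owner/repo pair. The sketch below shows one way to do that in plain TypeScript as a stand-in for the Zod schema; `resolveRepo`, `RepoId`, and the URL pattern are assumptions, and the project's actual regex may accept more forms.

```typescript
interface RepoId {
  owner: string;
  repo: string;
}

// Assumed pattern: https://github.com/<owner>/<repo>, with an optional
// ".git" suffix and an optional trailing slash.
const GITHUB_URL = /^https?:\/\/github\.com\/([^/\s]+)\/([^/\s]+?)(?:\.git)?\/?$/;

// Hypothetical resolver: returns null when the body matches neither shape.
function resolveRepo(body: unknown): RepoId | null {
  if (typeof body !== "object" || body === null) return null;
  const fields = body as Record<string, unknown>;

  const repoUrl = fields["repoUrl"];
  if (typeof repoUrl === "string") {
    const match = GITHUB_URL.exec(repoUrl.trim());
    return match ? { owner: match[1], repo: match[2] } : null;
  }

  const owner = fields["owner"];
  const repo = fields["repo"];
  if (typeof owner === "string" && typeof repo === "string" && owner && repo) {
    return { owner, repo };
  }
  return null;
}
```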

Success Response (200):

{
  "success": true,
  "data": {
    "repoName": "next.js",
    "repoUrl": "https://github.com/vercel/next.js",
    "repoDescription": "The React Framework",
    "repoPrimaryLanguage": "TypeScript",
    "repoStars": 128000,
    "totalScore": 62.0,
    "categoryScores": {
      "readme": { "normalizedScore": 78, "weight": 0.25, "weightedScore": 19.5, "raw": { "..." : "..." } },
      "commits": { "normalizedScore": 55, "weight": 0.20, "weightedScore": 11.0, "raw": { "..." : "..." } },
      "techStack": { "normalizedScore": 70, "weight": 0.30, "weightedScore": 21.0, "raw": { "..." : "..." } },
      "complexity": { "normalizedScore": 42, "weight": 0.25, "weightedScore": 10.5, "raw": { "..." : "..." } }
    },
    "hiringConfidence": { "level": "Moderate", "score": 62.0, "reasoning": "..." },
    "riskFlags": [ { "severity": "warning", "message": "...", "category": "commits" } ],
    "aiSuggestions": null,
    "analyzedAt": "2026-02-20T10:30:00.000Z",
    "analysisVersion": "1.0.0",
    "processingTimeMs": 4523
  }
}

Error Responses:

Status	Condition
`400`	Validation failed (missing/invalid URL)
`404`	Repository not found
`415`	Wrong content type (non-JSON)
`429`	GitHub API rate limit exceeded (includes `Retry-After` header)
`500`	Unexpected server error
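
The status table can be pictured as a mapping from error kinds to responses. Everything in this sketch is hypothetical: the `AnalysisError` union, the `{ success: false, error }` body shape, and `toErrorResponse` are illustrative names, not the project's error classes.

```typescript
// Assumed error kinds, one per row of the table above.
type AnalysisError =
  | { kind: "validation"; message: string }
  | { kind: "not_found"; message: string }
  | { kind: "unsupported_media_type"; message: string }
  | { kind: "rate_limited"; message: string; retryAfterSeconds: number }
  | { kind: "unexpected"; message: string };

interface ErrorResponse {
  status: number;
  headers: Record<string, string>;
  body: { success: false; error: string }; // assumed body shape
}

function toErrorResponse(err: AnalysisError): ErrorResponse {
  const body = { success: false as const, error: err.message };
  switch (err.kind) {
    case "validation":
      return { status: 400, headers: {}, body };
    case "not_found":
      return { status: 404, headers: {}, body };
    case "unsupported_media_type":
      return { status: 415, headers: {}, body };
    case "rate_limited":
      // Only the 429 response carries a Retry-After header.
      return {
        status: 429,
        headers: { "Retry-After": String(err.retryAfterSeconds) },
        body,
      };
    case "unexpected":
      return { status: 500, headers: {}, body };
  }
}
```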

`GET /api/health`

Health check endpoint for monitoring and deployment verification.

Response (200):

{
  "status": "ok",
  "version": "1.0.0",
  "timestamp": "2026-02-20T10:30:00.000Z",
  "services": {
    "github": {
      "configured": true,
      "rateLimit": { "remaining": 4985, "limit": 5000, "resetsAt": "2026-02-20T11:00:00.000Z" }
    },
    "gemini": { "configured": true }
  }
}

Scoring Methodology

Category Weights

Category	Weight	Rationale
Tech Stack	30%	Directly reflects engineering maturity — testing, TypeScript, CI/CD, linting
README	25%	A good README signals the developer thinks about users and documentation
Complexity	25%	Structural quality shows software engineering fundamentals
Commits	20%	Important but slightly less since commit patterns can vary by workflow
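
As a worked illustration of these weights, the sketch below combines four normalized category scores into a total. The function name `totalScore` and the rounding to one decimal place are assumptions; only the weights come from the table.

```typescript
type Category = "readme" | "commits" | "techStack" | "complexity";

// Weights from the table above; they sum to 1.
const WEIGHTS: Record<Category, number> = {
  techStack: 0.3,
  readme: 0.25,
  complexity: 0.25,
  commits: 0.2,
};

// Hypothetical helper: weighted sum of normalized 0-100 category scores,
// rounded to one decimal place.
function totalScore(normalized: Record<Category, number>): number {
  let sum = 0;
  for (const category of Object.keys(WEIGHTS) as Category[]) {
    sum += normalized[category] * WEIGHTS[category];
  }
  return Math.round(sum * 10) / 10;
}
```

For category scores of 78 (README), 55 (commits), 70 (tech stack), and 42 (complexity), the weighted parts are 19.5 + 11 + 21 + 10.5, giving a total of 62.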

Hiring Confidence Levels

Score	Level	Meaning
0–40	Low	Project needs significant work before it demonstrates engineering quality
41–70	Moderate	Decent fundamentals with clear room for improvement
71–100	Strong	Production-quality engineering signals across most categories
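
The bands translate into a small lookup. This is a sketch: `confidenceLevel` is a hypothetical name, and because the table lists whole-number bands, the handling of fractional scores between them (for example 40.5) is an assumption; here they fall into the higher band.

```typescript
type ConfidenceLevel = "Low" | "Moderate" | "Strong";

// Maps a 0-100 total score to the level from the table above.
function confidenceLevel(score: number): ConfidenceLevel {
  if (score <= 40) return "Low";
  if (score <= 70) return "Moderate";
  return "Strong";
}
```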

Analyzer Breakdown

README Analyzer (max 100 pts)

Metric	Points	Description
Presence	5	README exists and is non-empty
Word count	20	Tiered: <100 (5), 100–299 (10), 300–799 (15), 800+ (20)
Required sections	15	5 pts each for Installation, Usage, Features
Bonus sections	12+	3 pts each for Contributing, License, Deployment, Architecture, etc.
Code blocks	15	First block (10), 3+ blocks (+5)
Screenshots/demo	8	Image links or demo URL patterns
Badges	4	Build/coverage badge patterns
Architecture section	8	Heading match for architecture/design/structure
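
Two of these metrics are simple enough to sketch directly. The function names are hypothetical, and matching headings by case-insensitive substring is an assumption about how the keyword matching works; the tiers and point values come from the table.

```typescript
// Word count tiers: <100 (5), 100-299 (10), 300-799 (15), 800+ (20).
function wordCountPoints(wordCount: number): number {
  if (wordCount >= 800) return 20;
  if (wordCount >= 300) return 15;
  if (wordCount >= 100) return 10;
  return 5;
}

const REQUIRED_SECTIONS = ["installation", "usage", "features"];

// 5 points for each required section that appears in some heading.
function requiredSectionPoints(headings: string[]): number {
  const lower = headings.map((h) => h.toLowerCase());
  let points = 0;
  for (const section of REQUIRED_SECTIONS) {
    if (lower.some((h) => h.includes(section))) {
      points += 5;
    }
  }
  return points;
}
```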

Commit Analyzer (max 100 pts)

Metric	Points	Description
Commit count	15	Tiered: 1–4 (3), 5–14 (7), 15–29 (10), 30+ (15)
Message length	20	Average characters: <10 (0), 10–29 (5), 30–49 (10), 50–72 (20), >72 (15)
Conventional commits	20	Ratio of prefixed messages (`feat:`, `fix:`, `docs:`, etc.)
Temporal spread	25	Unique active days: 1 (5), 2–3 (10), 4–6 (15), 7–13 (20), 14+ (25)
Penalty: Burst	-10	>10 commits in a single day (sliding window)
Penalty: Duplicates	-10	>20% identical commit messages
Penalty: Low-effort	-5	>30% messages matching "update", "fix", "wip", etc.
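
The temporal spread tiers and the conventional-commit ratio might be computed along these lines. Treat this as a sketch: the `Commit` shape, the list of accepted prefixes beyond `feat`, `fix`, and `docs`, and the choice to bucket days by the date part of an ISO 8601 timestamp are all assumptions.

```typescript
interface Commit {
  message: string;
  date: string; // ISO 8601 timestamp, e.g. "2026-02-20T10:30:00Z"
}

// Unique active days: 1 (5), 2-3 (10), 4-6 (15), 7-13 (20), 14+ (25).
function temporalSpreadPoints(commits: Commit[]): number {
  const days = new Set(commits.map((c) => c.date.slice(0, 10)));
  const n = days.size;
  if (n >= 14) return 25;
  if (n >= 7) return 20;
  if (n >= 4) return 15;
  if (n >= 2) return 10;
  if (n === 1) return 5;
  return 0; // no commits at all
}

// Assumed prefix list; an optional "(scope)" and "!" are allowed.
const CONVENTIONAL =
  /^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?!?: /;

// Fraction of commit messages that start with a conventional prefix.
function conventionalRatio(commits: Commit[]): number {
  if (commits.length === 0) return 0;
  const matching = commits.filter((c) => CONVENTIONAL.test(c.message)).length;
  return matching / commits.length;
}
```

How the ratio converts into the 20 available points is not specified above, so the sketch stops at the ratio.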

Tech Stack Analyzer (max 100 pts)

Metric	Points	Description
Framework detection	15	React, Next.js, Express, Vue, Angular, Svelte, etc.
State management	10	Redux, Zustand, Jotai, MobX, Pinia, etc.
Testing	20	Jest, Vitest, Cypress, Playwright + test file detection
Linting	10	ESLint, Prettier + config file presence
TypeScript	15	`typescript` in deps or `tsconfig.json` in tree
Environment config	5	`.env` files in tree
CI/CD	10	`.github/workflows`, `.gitlab-ci.yml`, `Jenkinsfile`, etc.
Containerization	5	`Dockerfile` or `docker-compose.yml`

Outputs a maturity tier: Basic / Intermediate / Advanced
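
Detection combines `package.json` dependencies with paths from the file tree. The sketch below covers the TypeScript and CI/CD rows only; `usesTypeScript` and `hasCiConfig` are hypothetical names, and the exact path rules are assumptions.

```typescript
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// True if `typescript` is a dependency or a tsconfig.json exists in the tree.
function usesTypeScript(pkg: PackageJson | null, treePaths: string[]): boolean {
  const deps = { ...pkg?.dependencies, ...pkg?.devDependencies };
  if ("typescript" in deps) return true;
  return treePaths.some(
    (p) => p === "tsconfig.json" || p.endsWith("/tsconfig.json"),
  );
}

// True if any of the CI config locations named in the table is present.
function hasCiConfig(treePaths: string[]): boolean {
  return treePaths.some(
    (p) =>
      p.startsWith(".github/workflows/") ||
      p === ".gitlab-ci.yml" ||
      p === "Jenkinsfile",
  );
}
```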

Complexity Analyzer (max 100 pts)

Metric	Points	Description
Folder depth	15	Max nesting: 1–2 (3), 3–4 (7), 5–6 (11), 7+ (15)
File count	10	Code files: <5 (2), 5–19 (5), 20–49 (7), 50+ (10)
Separation of concerns	20	Detects: models, utils, helpers, middleware, hooks, lib, services
API routes	10	Detects: api, routes, graphql, resolvers folders
Authentication	10	Detects: auth, login, signup, session folders
Error handling	10	Detects: error, errors, exception patterns
Config abstraction	10	Detects: config, constants, env, settings folders
Modularity bonus	15	Shannon entropy of file distribution across folders
Penalty: Anti-patterns	-5 each (max 3)	Files >500 lines detected in tree
Penalty: Monolith	-15	Single file >40% of total codebase
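
The modularity bonus rests on Shannon entropy, H = -Σ p_i · log2(p_i), where p_i is the share of files in folder i. A minimal sketch of that calculation follows; `folderEntropy` is a hypothetical name, and grouping files by their immediate parent folder is an assumption.

```typescript
// Shannon entropy (in bits) of how files are spread across folders.
// 0 means every file sits in one folder; higher means a more even spread.
function folderEntropy(filePaths: string[]): number {
  const counts: Record<string, number> = {};
  for (const path of filePaths) {
    const slash = path.lastIndexOf("/");
    const folder = slash === -1 ? "." : path.slice(0, slash);
    counts[folder] = (counts[folder] ?? 0) + 1;
  }

  const total = filePaths.length;
  let entropy = 0;
  for (const count of Object.values(counts)) {
    const p = count / total;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}
```

Four files spread evenly over four folders give 2 bits; four files in a single folder give 0. How the entropy value maps onto the 15 available points is not specified above.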

Key Principle: Determinism

The scoring pipeline is 100% deterministic. Same repo data always produces the same score. AI is never involved in scoring — it only rewrites suggestion text and generates the improvement roadmap.

Tech Stack

Layer	Technology	Why
Framework	Next.js 16 (App Router)	Full-stack in one project, API routes + React UI
Language	TypeScript 5 (strict)	Type safety across the entire stack
Styling	Tailwind CSS v4	Utility-first, fast iteration
GitHub API	Axios	Interceptors for rate-limit tracking, retry support
Validation	Zod	Runtime type safety on API boundaries, TS inference
Logging	Pino + `pino-pretty`	Low-overhead Node.js logger, structured JSON in prod
AI	Google Gemini (`@google/generative-ai`)	Suggestion enhancement only, generous free tier
Charts	Recharts	Built for React, good radar chart support
Icons	Lucide React	Consistent, tree-shakeable icon set
Animation	Framer Motion	Declarative animations with mount/unmount support
Toasts	Sonner	Beautiful toast notifications, minimal setup
Testing	Jest + `ts-jest`	Full TypeScript support, standard testing stack

Project Structure

github-analyzer/
├── app/
│   ├── api/
│   │   ├── analyze/route.ts          # POST /api/analyze — main analysis endpoint
│   │   └── health/route.ts           # GET /api/health — status check
│   ├── globals.css                    # Global styles & design tokens
│   ├── layout.tsx                     # Root layout with metadata
│   └── page.tsx                       # Dashboard page
├── components/
│   ├── ui/                            # Reusable UI primitives
│   │   ├── Badge.tsx, Button.tsx, Card.tsx,
│   │   ├── Input.tsx, Progress.tsx, Skeleton.tsx
│   └── analyzer/                      # Dashboard-specific components
│       ├── SearchBar.tsx              # GitHub URL input + validation
│       ├── ScoreOverview.tsx          # Score gauge + hiring confidence
│       ├── CategoryBreakdown.tsx      # 4 category cards with details
│       ├── RadarChart.tsx             # Spider diagram of scores
│       ├── CommitTimeline.tsx         # Commit distribution bar chart
│       ├── TechStackBadges.tsx        # Detected tech as pill badges
│       ├── ImprovementRoadmap.tsx     # AI roadmap or raw suggestions
│       └── ExportButton.tsx           # Download analysis as JSON
├── lib/
│   ├── config/
│   │   ├── thresholds.ts             # All scoring weights & point allocations
│   │   └── github.ts                 # GitHub API config, URL regex, cache config
│   ├── services/
│   │   ├── github/
│   │   │   ├── client.ts             # Axios instance, retry, rate limit tracking
│   │   │   ├── repoService.ts        # 6 parallel data fetchers
│   │   │   └── types.ts              # Zod schemas for API responses
│   │   ├── analyzers/
│   │   │   ├── readmeAnalyzer.ts     # README quality (8 metrics, max 100)
│   │   │   ├── commitAnalyzer.ts     # Commit discipline (6 metrics + 3 penalties)
│   │   │   ├── techStackAnalyzer.ts  # Tech maturity (8 metrics + tiers)
│   │   │   └── complexityAnalyzer.ts # Structural complexity (10 metrics + 2 penalties)
│   │   ├── scoring/
│   │   │   ├── engine.ts             # Weighted scoring + confidence + risk flags
│   │   │   └── normalizer.ts         # Score normalization utilities
│   │   ├── ai/
│   │   │   ├── geminiClient.ts       # Gemini SDK wrapper with retry/timeout
│   │   │   └── enhancer.ts           # AI suggestion rewriting + roadmap
│   │   └── orchestrator.ts           # 6-stage analysis pipeline
│   └── utils/
│       ├── errors.ts                  # Custom error classes
│       └── logger.ts                  # Pino structured logging
├── types/
│   ├── github.ts                      # GitHub API TypeScript interfaces
│   └── analysis.ts                    # Analysis/scoring/report interfaces
├── __tests__/                         # Jest test suite
│   ├── analyzers/                     # Analyzer unit tests
│   ├── scoring/                       # Scoring engine tests
│   └── api/                           # API route tests
├── .env.example                       # Environment variable template
├── jest.config.ts                     # Jest + ts-jest configuration
├── tsconfig.json                      # Strict TypeScript + path aliases
├── next.config.ts                     # Next.js configuration
└── package.json                       # Dependencies + scripts

Testing

# Run all tests
npm test

# Watch mode
npm run test:watch

# With coverage report
npm run test:coverage

Tests cover all 4 analyzers, the scoring engine, and API routes with edge cases including empty inputs, maximum scores, penalty triggers, and boundary conditions.

Known Limitations

Last 100 commits only — GitHub API pagination limit per request
File tree may be truncated — Trees API caps at ~100K entries for very large repos
README analysis is pattern-based — Heading keyword matching, not semantic NLP
No runtime analysis — Cannot verify code compiles, runs, or passes tests
Node.js/JS ecosystem focus — package.json based detection; Python/Go/Rust deps not analyzed
Single-repo only — No multi-repo portfolio scoring (designed for future extension)

Contributing

Contributions are welcome! To get started:

Fork the repository
Create a feature branch: git checkout -b feat/my-feature
Make changes and ensure tests pass: npm test
Ensure code is formatted: npm run format
Ensure linting passes: npm run lint
Submit a pull request

Adding a New Analyzer

The architecture is designed for easy extensibility. To add a new analysis category:

Create lib/services/analyzers/yourAnalyzer.ts implementing (input) => AnalyzerResult
Add thresholds to lib/config/thresholds.ts
Register in the orchestrator (lib/services/orchestrator.ts)
Add weight to SCORING_WEIGHTS in thresholds config
Add tests in __tests__/analyzers/
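
To make the first step concrete, here is a rough sketch of what a new analyzer might look like. The `AnalyzerResult` fields shown are an assumption based on the findings, strengths, and suggestions mentioned under Features; the real interface in `types/analysis.ts` may differ, and `licenseAnalyzer` is a made-up example.

```typescript
// Assumed shape; check types/analysis.ts for the real definition.
interface AnalyzerResult {
  score: number;
  maxScore: number;
  findings: string[];
  strengths: string[];
  suggestions: string[];
}

// Hypothetical analyzer: checks for a license file at the repository root.
// Like the existing analyzers, it is a pure function of its input.
function licenseAnalyzer(input: { treePaths: string[] }): AnalyzerResult {
  const hasLicense = input.treePaths.some((p) =>
    /^licen[sc]e(\.(md|txt))?$/i.test(p),
  );
  return {
    score: hasLicense ? 100 : 0,
    maxScore: 100,
    findings: [hasLicense ? "License file found" : "No license file found"],
    strengths: hasLicense ? ["Repository declares a license"] : [],
    suggestions: hasLicense ? [] : ["Add a LICENSE file at the repository root"],
  };
}
```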

License

This project is open source and available under the MIT License.

Built with Next.js, TypeScript, and a passion for engineering quality.
