mlinaresweb/Code-Context-Diff

@mlinaresweb/code-context

Multilanguage code context extraction from Git diffs for AI documentation, code review, automated changelogs, pull request assistants and developer tooling.

@mlinaresweb/code-context takes a Git diff and builds a structured, AI-ready context pack by reading the changed files and discovering the surrounding code that matters: complete functions, classes, interfaces, imports, exports, dependency injection usage, framework references, WordPress hooks, AJAX handlers, REST routes, templates, CSS selectors, DOM ids, and cross-file usage.

The goal is simple: when an AI receives a diff, it should also receive the real code context needed to understand the change without sending the entire repository.


Why this library exists

Large Language Models can summarize code changes, write documentation and help review pull requests, but a raw diff is often not enough.

A diff may only show this:

- return this.userService.updateProfile(id, body);
+ return this.userService.updateProfile(id, body, context);

But to understand it properly, the AI may need:

  • the complete controller method,
  • the service method,
  • the DTO,
  • the repository,
  • dependency injection tokens,
  • imports and path aliases,
  • related framework decorators,
  • usage references,
  • tests or documentation constraints if available,
  • and, in WordPress projects, templates, AJAX handlers, hooks, partials and frontend assets.

This library builds that context automatically.

It is especially useful when building tools like:

  • AI documentation generators,
  • AI PR reviewers,
  • commit documentation assistants,
  • technical changelog generators,
  • codebase-aware agents,
  • automated onboarding explainers,
  • internal engineering documentation pipelines,
  • CI quality gates for documentation.

What it can do

Diff-based changed file detection

The library parses standard Git diffs and detects changed files and changed line ranges.

It can also use stagedFiles as a fallback when the diff does not contain enough file metadata.
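As a rough illustration of what detecting changed files and line ranges involves, the sketch below reads the `+++` file headers and `@@` hunk headers of a unified diff. It is not the library's parser: `parseChangedFiles`, `ChangedFile` and `ChangedRange` are hypothetical names, and the ranges cover whole hunks (context lines included) rather than the exact changed lines.

```typescript
interface ChangedRange {
  readonly start: number; // first line of the hunk in the new file (1-based)
  readonly end: number;   // last line of the hunk in the new file (inclusive)
}

interface ChangedFile {
  readonly path: string;
  readonly ranges: ChangedRange[];
}

// Hypothetical helper: a minimal unified-diff reader, not the library's parser.
// It assumes no added line in a hunk body happens to start with "+++ ".
function parseChangedFiles(diff: string): ChangedFile[] {
  const files: ChangedFile[] = [];
  let current: ChangedFile | undefined;

  for (const line of diff.split('\n')) {
    if (line.startsWith('+++ ')) {
      const target = line.slice(4).trim();
      if (target === '/dev/null') {
        current = undefined; // deleted file: nothing to read on disk
        continue;
      }
      current = { path: target.replace(/^b\//, ''), ranges: [] };
      files.push(current);
      continue;
    }

    // Hunk header: @@ -oldStart,oldCount +newStart,newCount @@
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/.exec(line);
    if (hunk !== null && current !== undefined) {
      const start = Number(hunk[1]);
      const count = hunk[2] === undefined ? 1 : Number(hunk[2]);
      if (count > 0) {
        current.ranges.push({ start, end: start + count - 1 });
      }
    }
  }

  return files;
}
```

When a diff carries hunks but no usable file headers, a list such as stagedFiles is the only remaining source of paths, which is why a fallback of that kind is useful.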

Complete symbol extraction

It extracts complete code regions around changed lines instead of sending isolated fragments.

Examples of extracted symbols:

  • classes,
  • methods,
  • functions,
  • interfaces,
  • traits,
  • components,
  • controllers,
  • services,
  • hooks,
  • callbacks,
  • route handlers,
  • template windows,
  • fallback file windows when exact symbols cannot be resolved.
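The last item, a fallback file window, can be pictured as a slice of the file around the changed lines. The sketch below is an assumption about how such a window might be computed, not the library's implementation; `fallbackWindow` and `FileWindow` are hypothetical names, and `radiusLines` mirrors the configuration option of the same name.

```typescript
interface FileWindow {
  readonly startLine: number; // 1-based, inclusive
  readonly endLine: number;   // 1-based, inclusive
  readonly text: string;
}

// Hypothetical helper: widen a changed range by `radiusLines` on both sides,
// clamped to the bounds of the file.
function fallbackWindow(
  source: string,
  changedStart: number,
  changedEnd: number,
  radiusLines: number,
): FileWindow {
  const lines = source.split('\n');
  const startLine = Math.max(1, changedStart - radiusLines);
  const endLine = Math.min(lines.length, changedEnd + radiusLines);

  return {
    startLine,
    endLine,
    text: lines.slice(startLine - 1, endLine).join('\n'),
  };
}
```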

Cross-file context

The library can follow project relationships such as:

  • imports,
  • exports,
  • local module references,
  • usage references,
  • property references,
  • dependency injection references,
  • framework references,
  • WordPress template references,
  • WordPress AJAX actions,
  • WordPress hooks,
  • CSS and DOM relationships.

AI-ready Markdown

It renders the extracted context into Markdown that can be passed directly to an AI prompt.

Structured JSON pack

It also returns a structured CodeContextPack object that can be inspected, stored, analyzed or transformed.

Quality and trust scoring

The report includes:

  • quality score,
  • context trust status,
  • unresolved references,
  • documentation readiness,
  • diagnostics,
  • warnings,
  • recommendations.

Strict documentation mode

You can make the analysis fail if the context is not safe enough for documentation.

This is useful in CI or before publishing generated documentation.


Supported languages and ecosystems

The library is designed to be multilanguage and framework-aware.

Current focus:

| Ecosystem | Supported context |
| --- | --- |
| TypeScript | classes, methods, interfaces, imports, exports, DI, usage |
| JavaScript | functions, classes, imports, frontend usage, globals |
| TSX / JSX | React components, hooks, imports, usage |
| Node.js | services, controllers, dependency graph |
| NestJS | decorators, controllers, services, DTOs, DI context |
| tsyringe | container resolution and dependency injection references |
| Next.js | TSX/React context, route/app related code patterns |
| Nuxt / Vue | SFC parsing, script blocks, provide/inject references |
| PHP | classes, traits, interfaces, functions, procedural helpers |
| WordPress | hooks, filters, AJAX, REST, shortcodes, templates, ACF, CSS/DOM |
| Python | functions, classes, FastAPI-like dependency context |
| CSS / SCSS | selectors, related DOM classes, style context |
| SQL | fallback context for changed SQL files |

The design is intentionally extensible. More language-specific extractors can be added over time.


Installation

npm install @mlinaresweb/code-context

For local development inside this repository:

npm install
npm run check
npm run test
npm run build

Requirements

  • Node.js >=20
  • ESM project support
  • A Git diff string as input
  • Access to the local repository files being analyzed

The library reads files from the local filesystem. It does not need a remote backend.


Quick start

import { buildCodeContextReport } from '@mlinaresweb/code-context';

const diff = `
diff --git a/src/user.service.ts b/src/user.service.ts
index 1111111..2222222 100644
--- a/src/user.service.ts
+++ b/src/user.service.ts
@@ -1,6 +1,6 @@
 export class UserService {
   public getName(): string {
-    return 'Ada';
+    return 'Ada Lovelace';
   }
 }
`;

const report = await buildCodeContextReport({
  repositoryRoot: process.cwd(),
  diff,
  stagedFiles: ['src/user.service.ts'],
  config: {
    preset: 'fullstack',
  },
  documentationContext: {
    mode: 'warn',
    minimumTrustStatus: 'safe',
  },
});

console.log(report.markdown);
console.log(report.quality.contextTrustStatus);
console.log(report.documentation.passed);
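One way to use these three fields together is to assemble the final prompt from them. The sketch below only relies on the field names shown above; `ReportLike` and `buildDocumentationPrompt` are hypothetical, and the prompt wording is an arbitrary choice.

```typescript
// Structural subset of the report, limited to the fields used in the quick start.
interface ReportLike {
  readonly markdown: string;
  readonly quality: { readonly contextTrustStatus: string };
  readonly documentation: { readonly passed: boolean };
}

// Hypothetical helper: combine the diff and the rendered context into one prompt,
// and tell the model when the context did not pass the documentation gate.
function buildDocumentationPrompt(report: ReportLike, diff: string): string {
  const sections: string[] = ['Document the following change.'];

  if (!report.documentation.passed) {
    sections.push(
      `Context trust status: ${report.quality.contextTrustStatus}. ` +
        'State clearly what cannot be verified.',
    );
  }

  sections.push('Diff:', diff.trim(), 'Code context:', report.markdown);
  return sections.join('\n\n');
}
```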

Core concepts

CodeContextPack

The raw structured output.

It contains changed files, changed symbols, related symbols, references, diagnostics and warnings.

CodeContextReport

A higher-level result produced by buildCodeContextReport.

It includes:

  • pack
  • markdown
  • debugMarkdown
  • summary
  • quality
  • documentation
  • warnings

Changed symbols

Symbols directly affected by the diff.

Examples:

  • changed function,
  • changed method,
  • changed class,
  • changed template window,
  • changed component.

Related symbols

Symbols not directly changed but required to understand the change.

Examples:

  • imported service,
  • DTO used by a controller,
  • repository used by a service,
  • AJAX handler called by frontend JS,
  • CSS file defining classes used by a changed template,
  • WordPress partial included by get_template_part.

Diagnostics

Debug-level or quality-level facts generated during analysis.

Examples:

  • WordPress index built,
  • WordPress index cache hit,
  • related file included,
  • reference resolved,
  • reference unresolved.

Documentation readiness

A strict or warning-based quality gate that tells you whether the context is safe enough to generate final documentation.


Getting a Git diff

The library expects a diff string. You can obtain it in different ways.

Staged changes

git diff --staged

Node example:

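import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Path of the repository to analyze; adjust as needed.
const repositoryRoot = process.cwd();

// Raise maxBuffer: the 1 MiB default can overflow on large diffs.
const { stdout: diff } = await execFileAsync(
  'git',
  ['-C', repositoryRoot, 'diff', '--staged'],
  { maxBuffer: 64 * 1024 * 1024 },
);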

Unstaged changes

git diff

Compare two commits

git diff HEAD~1..HEAD

Compare two branches

git diff main..feature/my-branch

Get changed file names

git diff --name-only --staged

You can pass those file names as stagedFiles:

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles,
});
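The `--name-only` output is one path per line, so it needs to be split before it can be passed as stagedFiles. A minimal sketch, assuming plain paths (Git quotes paths with unusual characters unless `-z` is used, which this hypothetical `parseNameOnlyOutput` helper does not handle):

```typescript
// Hypothetical helper: turn `git diff --name-only` output into a list of paths.
function parseNameOnlyOutput(stdout: string): string[] {
  return stdout
    .split(/\r?\n/)
    .filter((line) => line.length > 0); // drop the trailing empty line
}
```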

GitHub Pull Request diff

With GitHub CLI:

gh pr diff 123 > pr.diff

Then read the file and pass it to the library:

import { readFile } from 'node:fs/promises';

const diff = await readFile('pr.diff', 'utf8');

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
});

GitHub Actions example

name: Code Context

on:
  pull_request:

jobs:
  context:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - run: npm ci

      - name: Generate diff
        run: git diff origin/${{ github.base_ref }}...HEAD > pr.diff

      - name: Generate code context report
        run: node scripts/generate-code-context.js

Main API

buildCodeContextReport

Recommended for most users.

import { buildCodeContextReport } from '@mlinaresweb/code-context';

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles,
  config: {
    preset: ['fullstack', 'documentation'],
  },
  debugOptions: {
    includePromptMarkdown: true,
    includeDiagnostics: true,
  },
  documentationContext: {
    mode: 'warn',
    minimumTrustStatus: 'safe',
  },
});

Returns:

{
  pack,
  markdown,
  debugMarkdown,
  summary,
  quality,
  documentation,
  warnings
}

buildCodeContextPack

Lower-level API.

Use this if you want the raw context pack and plan to render or process it yourself.

import { buildCodeContextPack } from '@mlinaresweb/code-context';

const pack = await buildCodeContextPack({
  repositoryRoot,
  diff,
  stagedFiles,
  config: {
    preset: 'balanced',
  },
});

renderCodeContextForPrompt

Render a CodeContextPack to Markdown.

import {
  buildCodeContextPack,
  renderCodeContextForPrompt,
} from '@mlinaresweb/code-context';

const pack = await buildCodeContextPack({
  repositoryRoot,
  diff,
});

const markdown = renderCodeContextForPrompt({
  pack,
});

writeCodeContextReportFiles

Writes ready-to-inspect files.

import { writeCodeContextReportFiles } from '@mlinaresweb/code-context';

const result = await writeCodeContextReportFiles({
  repositoryRoot,
  diff,
  stagedFiles,
  outputDirectory: '.code-context-report',
  config: {
    preset: 'wordpress-large',
  },
});

Generated files:

  • code-context-report.md
  • code-context-debug.md
  • code-context-pack.json
  • code-context-summary.json

Presets

Presets are the easiest way to configure the library.

config: {
  preset: 'wordpress-large',
}

You can combine presets:

config: {
  preset: ['fullstack', 'wordpress-large', 'documentation'],
}

Manual config always wins:

config: {
  preset: 'wordpress-large',
  maxCharacters: 900_000,
}

Available presets

| Preset | Purpose |
| --- | --- |
| balanced | General-purpose default behavior |
| documentation | More context for AI documentation |
| fullstack | Broad JS/TS/PHP/Python/frontend backend context |
| wordpress | WordPress projects with moderate size |
| wordpress-large | Large WordPress themes/plugins and custom platforms |
| ci-fast | Faster analysis for CI pipelines |

Documentation trust and strict mode

The report includes a trust status:

report.quality.contextTrustStatus

Possible values:

| Status | Meaning |
| --- | --- |
| safe | Context is good enough for final documentation |
| partial | Context is useful but should be reviewed |
| unsafe | Important context is missing |

Warn mode

Default recommended mode for interactive documentation tools.

documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}

The function does not throw. You can inspect:

report.documentation.passed
report.documentation.blockingIssues

Strict mode

Recommended for CI or publishing gates.

documentationContext: {
  mode: 'strict',
  minimumTrustStatus: 'safe',
}

If the context does not meet the required trust level, the library throws:

CodeContextUnsafeDocumentationContextError

Example:

import {
  CodeContextUnsafeDocumentationContextError,
  buildCodeContextReport,
} from '@mlinaresweb/code-context';

try {
  const report = await buildCodeContextReport({
    repositoryRoot,
    diff,
    documentationContext: {
      mode: 'strict',
      minimumTrustStatus: 'safe',
    },
  });
} catch (error) {
  if (error instanceof CodeContextUnsafeDocumentationContextError) {
    console.error(error.readiness.message);
    console.error(error.quality.contextTrustStatus);
  }

  throw error;
}

Permissive mode

Never blocks, even if context is unsafe.

documentationContext: {
  mode: 'permissive',
  minimumTrustStatus: 'safe',
}
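The three modes can be read as one comparison followed by a different reaction. The sketch below is a hedged model of that behavior, not the library's code: `evaluateGate`, `UnsafeContextError` and the ordering `unsafe < partial < safe` are assumptions, as is the choice to report `passed: true` in permissive mode.

```typescript
type TrustStatus = 'unsafe' | 'partial' | 'safe';
type GateMode = 'strict' | 'warn' | 'permissive';

// Assumed ordering of the trust statuses, lowest to highest.
const TRUST_RANK: Record<TrustStatus, number> = { unsafe: 0, partial: 1, safe: 2 };

// Hypothetical stand-in for CodeContextUnsafeDocumentationContextError.
class UnsafeContextError extends Error {
  readonly actual: TrustStatus;
  readonly minimum: TrustStatus;

  constructor(actual: TrustStatus, minimum: TrustStatus) {
    super(`Context trust is "${actual}" but "${minimum}" is required.`);
    this.name = 'UnsafeContextError';
    this.actual = actual;
    this.minimum = minimum;
  }
}

function evaluateGate(
  actual: TrustStatus,
  minimum: TrustStatus,
  mode: GateMode,
): { passed: boolean } {
  const meetsMinimum = TRUST_RANK[actual] >= TRUST_RANK[minimum];

  if (mode === 'permissive') {
    return { passed: true }; // never blocks (assumed to report success)
  }
  if (!meetsMinimum && mode === 'strict') {
    throw new UnsafeContextError(actual, minimum); // strict: fail loudly
  }
  return { passed: meetsMinimum }; // warn: report, let the caller decide
}
```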

WordPress support

WordPress support is one of the strongest parts of the library.

It is designed for real-world projects where functionality is spread across:

  • theme templates,
  • plugin bootstrap files,
  • functions.php,
  • procedural helpers,
  • classes,
  • AJAX handlers,
  • REST controllers,
  • shortcodes,
  • template parts,
  • frontend JS,
  • inline scripts,
  • CSS/SCSS,
  • ACF fields,
  • custom post types,
  • taxonomies,
  • page builders and integrations.

WordPress Project Index

The library builds an automatic WordPress index for the current theme/plugin root.

It indexes:

  • PHP functions,
  • PHP classes,
  • interfaces,
  • traits,
  • hooks,
  • filters,
  • AJAX actions,
  • REST routes,
  • shortcodes,
  • template references,
  • enqueue handles,
  • asset paths,
  • custom post types,
  • taxonomies,
  • ACF fields,
  • blocks,
  • JS functions,
  • window globals,
  • JS AJAX actions,
  • CSS classes,
  • DOM ids,
  • file basenames,
  • path tokens.

This is automatic. You do not need to hardcode project prefixes.

AJAX example

Changed template:

<script>
fetch('/wp-admin/admin-ajax.php?action=choose_delivery')
</script>

The library tries to find:

add_action('wp_ajax_choose_delivery', '...');
add_action('wp_ajax_nopriv_choose_delivery', '...');

and includes the handler context.
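The matching idea can be sketched with two regular expressions: one for the action name in frontend code and one for registered hooks in PHP. This is an illustration only, with hypothetical helper names; it handles `action=` in URLs and literal hook names, not actions sent in a request body or hooks built at runtime.

```typescript
// Collect the first capture group of every match of a global pattern.
function collectMatches(pattern: RegExp, source: string): string[] {
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const value = match[1];
    if (value !== undefined) {
      found.add(value);
    }
  }
  return Array.from(found);
}

// Hypothetical helper: AJAX action names referenced as `?action=...` or `&action=...`.
function findAjaxActions(frontendSource: string): string[] {
  return collectMatches(/[?&]action=([A-Za-z0-9_]+)/g, frontendSource);
}

// Hypothetical helper: hook names registered with a literal first argument.
function findRegisteredHooks(phpSource: string): string[] {
  return collectMatches(/add_action\(\s*['"]([^'"]+)['"]/g, phpSource);
}

// Hooks that would handle `action`, restricted to those actually registered.
function resolveAjaxAction(action: string, registeredHooks: readonly string[]): string[] {
  const expected = [`wp_ajax_${action}`, `wp_ajax_nopriv_${action}`];
  return expected.filter((hook) => registeredHooks.includes(hook));
}
```

An action for which `resolveAjaxAction` returns an empty list is the kind of case a tool would report as an unresolved reference.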

Template part example

Changed template:

get_template_part('template-parts/car-form/booking-summary');

The library tries to include:

template-parts/car-form/booking-summary.php
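The slug-to-file mapping follows WordPress's own rule for get_template_part: with a name argument it looks for `{slug}-{name}.php` first, then `{slug}.php`. The sketch below lists those candidates in order; `templatePartCandidates` is a hypothetical helper, and the paths are relative to the theme directory.

```typescript
// Hypothetical helper: candidate files for get_template_part($slug, $name),
// most specific first, relative to the theme root.
function templatePartCandidates(slug: string, name?: string): string[] {
  const candidates: string[] = [];
  if (name !== undefined && name !== '') {
    candidates.push(`${slug}-${name}.php`);
  }
  candidates.push(`${slug}.php`);
  return candidates;
}
```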

CSS and DOM relationship example

Changed template:

<div id="checkout-summary" class="booking-summary-wrapper">

The library can include related CSS/JS files that reference:

.booking-summary-wrapper {}
#checkout-summary {}

or JS:

document.getElementById('checkout-summary')
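To find those related files, a tool first needs the selectors that a changed template exposes. The following sketch extracts them from static `id` and `class` attributes; `extractSelectors` is a hypothetical helper, and values produced by PHP at render time are deliberately skipped.

```typescript
// Hypothetical helper: derive CSS selectors from literal id/class attributes.
function extractSelectors(html: string): string[] {
  const selectors = new Set<string>();
  const validName = /^[A-Za-z_][A-Za-z0-9_-]*$/; // skips dynamic values such as PHP tags

  const idPattern = /(?:^|\s)id\s*=\s*["']([^"']+)["']/g;
  const classPattern = /(?:^|\s)class\s*=\s*["']([^"']+)["']/g;
  let match: RegExpExecArray | null;

  while ((match = idPattern.exec(html)) !== null) {
    const id = (match[1] ?? '').trim();
    if (validName.test(id)) {
      selectors.add(`#${id}`);
    }
  }

  while ((match = classPattern.exec(html)) !== null) {
    for (const name of (match[1] ?? '').split(/\s+/)) {
      if (validName.test(name)) {
        selectors.add(`.${name}`);
      }
    }
  }

  return Array.from(selectors);
}
```

Each selector can then be searched for in CSS, SCSS and JS files to decide which of them belong in the context.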

WordPress coverage

The library reports unresolved important references.

Examples:

  • AJAX action used but no handler found,
  • template referenced but file not found,
  • PHP custom function called but not found,
  • JS global used but not found,
  • asset referenced but not found.

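Reporting these gaps lets a downstream prompt state what is unknown instead of leaving the model to guess, which reduces hallucinated explanations.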


Debug reports and diagnostics

The debug report explains what happened internally.

console.log(report.debugMarkdown);

It can include:

  • summary,
  • quality metrics,
  • warnings,
  • changed files,
  • changed symbols,
  • related symbols,
  • references,
  • diagnostics,
  • WordPress Project Index debug,
  • resolved references,
  • unresolved references,
  • LLM-ready Markdown.

Useful diagnostics include:

| Diagnostic code | Meaning |
| --- | --- |
| wordpress-index-summary | WordPress index was built |
| wordpress-index-cache-hit | Reused cached index |
| wordpress-index-cache-miss | Built new index |
| wordpress-related-file-included | Related file included |
| wordpress-reference-resolved | Reference resolved |
| wordpress-reference-unresolved | Reference unresolved |

Example:

const unresolved = report.pack.diagnostics?.filter((diagnostic) => {
  return diagnostic.code === 'wordpress-reference-unresolved';
});

Writing report files

Use writeCodeContextReportFiles to inspect context manually.

await writeCodeContextReportFiles({
  repositoryRoot,
  diff,
  stagedFiles,
  outputDirectory: './.code-context-report',
  config: {
    preset: ['wordpress-large', 'documentation'],
  },
  debugOptions: {
    includePromptMarkdown: true,
    includeDiagnostics: true,
  },
});

Output:

.code-context-report/
├─ code-context-report.md
├─ code-context-debug.md
├─ code-context-pack.json
└─ code-context-summary.json

CLI/report script usage

If you use the included report script during development, you can generate reports from Git.

Staged changes

npm run report -- --repo "/path/to/project" --staged --out "./code-context-output"

Unstaged changes

npm run report -- --repo "/path/to/project" --unstaged --out "./code-context-output"

Commit range

npm run report -- --repo "/path/to/project" --base HEAD~1 --head HEAD --out "./code-context-output"

WordPress large preset

npm run report -- --repo "/path/to/project" --staged --preset wordpress-large --out "./code-context-output"

Multiple presets

npm run report -- --repo "/path/to/project" --staged --preset fullstack,wordpress-large,documentation --out "./code-context-output"

Strict mode

npm run report -- --repo "/path/to/project" --staged --strict --minimum-trust safe --out "./code-context-output"

Configuration reference

The config object is resolved as:

defaults < preset < user config

Example:

config: {
  preset: 'wordpress-large',
  maxCharacters: 900_000,
}
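The precedence above amounts to a layered merge. The sketch below shows one way to express it; the numeric values in `DEFAULTS` and `PRESETS` are invented for illustration, `resolveConfig` is a hypothetical name, and the rule that later presets override earlier ones is an assumption.

```typescript
interface ContextConfig {
  maxCharacters: number;
  maxRelatedSymbols: number;
  maxExternalFiles: number;
}

type UserConfig = Partial<ContextConfig> & {
  preset?: string | readonly string[];
};

// Illustrative values only; the library's real defaults are not shown in this README.
const DEFAULTS: ContextConfig = {
  maxCharacters: 100000,
  maxRelatedSymbols: 50,
  maxExternalFiles: 20,
};

const PRESETS: Record<string, Partial<ContextConfig>> = {
  documentation: { maxCharacters: 300000 },
  'wordpress-large': { maxCharacters: 600000, maxExternalFiles: 200 },
};

// defaults < preset(s) < user config
function resolveConfig(user: UserConfig): ContextConfig {
  const { preset, ...overrides } = user;
  const presetNames =
    preset === undefined ? [] : typeof preset === 'string' ? [preset] : preset;

  let resolved: ContextConfig = { ...DEFAULTS };
  for (const name of presetNames) {
    resolved = { ...resolved, ...(PRESETS[name] ?? {}) };
  }
  return { ...resolved, ...overrides };
}
```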

Common options

| Option | Description |
| --- | --- |
| enabled | Enable or disable code context |
| preset | Preset or list of presets |
| maxCharacters | Global context character budget |
| maxRelatedSymbols | Maximum related symbols |
| maxExternalFiles | Maximum external files to inspect/include |
| radiusLines | Fallback line radius around changed lines |
| dependencyResolutionEnabled | Follow imports/modules |
| frameworkDetectionEnabled | Detect framework references |
| frameworkSymbolLinkingEnabled | Link framework refs to symbols |
| frameworkDeepLinkingEnabled | Include deeper framework-related context |
| usageAnalysisEnabled | Extract usage references |
| crossFileUsageResolutionEnabled | Resolve usage across files |
| dependencyInjectionDetectionEnabled | Detect DI references |
| crossFileDependencyInjectionResolutionEnabled | Resolve DI across files |
| propertyInferenceEnabled | Extract property references |
| contextPrioritizationEnabled | Prioritize context by relevance |
| qualityValidationEnabled | Add quality validation |

WordPress options

| Option | Description |
| --- | --- |
| wordpressIndexCacheEnabled | Reuse WordPress index within analysis |
| wordpressIndexMaxFiles | Max files scanned by WordPress index |

Output structure

BuildCodeContextReportResult

interface BuildCodeContextReportResult {
  readonly pack: CodeContextPack;
  readonly markdown: string;
  readonly debugMarkdown: string;
  readonly summary: CodeContextReportSummary;
  readonly quality: CodeContextQualityReport;
  readonly documentation: CodeContextDocumentationReadiness;
  readonly warnings: readonly string[];
}

CodeContextReportSummary

Contains:

  • total characters,
  • changed file count,
  • changed symbol count,
  • related symbol count,
  • reference counts,
  • warning count,
  • diagnostic count,
  • languages,
  • file paths,
  • symbols by relevance.

CodeContextQualityReport

Contains:

  • status,
  • contextTrustStatus,
  • score,
  • coverage metrics,
  • penalties,
  • unresolved reference counts,
  • isSafeForDocumentation,
  • issues,
  • recommendations.

Examples

Node / TypeScript service

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['src/users/user.service.ts'],
  config: {
    preset: 'fullstack',
  },
});

The report may include:

  • changed service method,
  • imported repository,
  • DTO,
  • interface,
  • dependency injection tokens,
  • usage references.

NestJS

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['apps/api/src/users/users.controller.ts'],
  config: {
    preset: ['fullstack', 'documentation'],
  },
});

WordPress template

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['wp-content/themes/my-theme/single-car-checkout.php'],
  config: {
    preset: 'wordpress-large',
  },
});

Python

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['app/api/users.py'],
  config: {
    preset: 'fullstack',
  },
});

React / Next

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['app/users/UserProfile.tsx'],
  config: {
    preset: 'fullstack',
  },
});

Vue / Nuxt

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['components/UserCard.vue'],
  config: {
    preset: 'fullstack',
  },
});

Recommended workflows

AI documentation generation

config: {
  preset: ['fullstack', 'documentation'],
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}

For large WordPress projects:

config: {
  preset: ['wordpress-large', 'documentation'],
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}

CI gate

documentationContext: {
  mode: 'strict',
  minimumTrustStatus: 'safe',
}

Fast pull request summary

config: {
  preset: 'ci-fast',
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'partial',
}

Performance and large repositories

For large repositories:

  • use wordpress-large only when needed,
  • keep wordpressIndexCacheEnabled: true,
  • increase wordpressIndexMaxFiles if the project is very large,
  • inspect code-context-debug.md,
  • review cache diagnostics,
  • increase maxCharacters only when the AI model can handle it.

Example:

config: {
  preset: 'wordpress-large',
  wordpressIndexMaxFiles: 10_000,
  maxCharacters: 900_000,
  maxExternalFiles: 220,
  maxRelatedSymbols: 500,
}

The WordPress index cache is per analysis execution. It does not persist between commands, which avoids stale results.
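A cache scoped to a single analysis can be modeled as a map owned by the run object, so it disappears when the run ends. The sketch below is an assumption about the shape of such a cache, not the library's internals; `AnalysisRun` and `ProjectIndex` are hypothetical, while the two diagnostic codes are the ones listed in this README.

```typescript
interface ProjectIndex {
  readonly root: string;
  readonly fileCount: number;
}

// Hypothetical model of one analysis execution owning its own index cache.
class AnalysisRun {
  readonly diagnostics: string[] = [];
  private readonly indexCache = new Map<string, ProjectIndex>();
  private readonly buildIndex: (root: string) => ProjectIndex;

  constructor(buildIndex: (root: string) => ProjectIndex) {
    this.buildIndex = buildIndex;
  }

  getIndex(root: string): ProjectIndex {
    const cached = this.indexCache.get(root);
    if (cached !== undefined) {
      this.diagnostics.push('wordpress-index-cache-hit');
      return cached;
    }

    this.diagnostics.push('wordpress-index-cache-miss');
    const index = this.buildIndex(root); // the expensive scan happens once per root
    this.indexCache.set(root, index);
    return index;
  }
}
```

Because every run starts with an empty map, files changed between two commands are always rescanned, at the cost of rebuilding the index each time.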


Troubleshooting

The report is unsafe

Check:

report.quality.criticalUnresolvedReferenceCount
report.quality.reportableUnresolvedReferenceCount
report.documentation.blockingIssues
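These fields can be condensed into a short list of reasons for logs or CI output. The sketch below uses only the field names above; `describeTrustProblems` is a hypothetical helper, and treating `blockingIssues` as a list of strings is an assumption about its type.

```typescript
// Structural subset of the report; blockingIssues is assumed to be string[].
interface TrustReportLike {
  readonly quality: {
    readonly criticalUnresolvedReferenceCount: number;
    readonly reportableUnresolvedReferenceCount: number;
  };
  readonly documentation: {
    readonly blockingIssues: readonly string[];
  };
}

// Hypothetical helper: one human-readable line per reason the context is not trusted.
function describeTrustProblems(report: TrustReportLike): string[] {
  const problems: string[] = [];
  const { quality, documentation } = report;

  if (quality.criticalUnresolvedReferenceCount > 0) {
    problems.push(`${quality.criticalUnresolvedReferenceCount} critical unresolved reference(s)`);
  }
  if (quality.reportableUnresolvedReferenceCount > 0) {
    problems.push(`${quality.reportableUnresolvedReferenceCount} reportable unresolved reference(s)`);
  }
  for (const issue of documentation.blockingIssues) {
    problems.push(`blocking: ${issue}`);
  }
  return problems;
}
```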

Open code-context-debug.md and look at:

  • unresolved WordPress references,
  • warnings,
  • included files,
  • coverage report.

A WordPress AJAX handler was not found

Make sure the project contains:

add_action('wp_ajax_my_action', 'my_callback');

or:

add_action('wp_ajax_nopriv_my_action', 'my_callback');

and that the file is inside the same theme/plugin root or within the configured scan limit.

A template part was not found

For:

get_template_part('template-parts/example');

the library expects:

template-parts/example.php

A context report is too large

Reduce:

maxCharacters
maxRelatedSymbols
maxExternalFiles

or use:

preset: 'ci-fast'
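To see why lowering these limits shrinks the report, it helps to picture the budget as a greedy selection over symbols ordered by relevance. The sketch below is a simplified model under that assumption; `fitToBudget`, `ContextSymbol` and the numeric `relevance` field are hypothetical and do not describe the library's actual prioritization.

```typescript
interface ContextSymbol {
  readonly name: string;
  readonly relevance: number; // higher means more important (assumed scale)
  readonly text: string;
}

// Hypothetical helper: keep the most relevant symbols that fit in the character budget.
function fitToBudget(
  symbols: readonly ContextSymbol[],
  maxCharacters: number,
): ContextSymbol[] {
  const byRelevance = symbols.slice().sort((a, b) => b.relevance - a.relevance);
  const kept: ContextSymbol[] = [];
  let used = 0;

  for (const symbol of byRelevance) {
    if (used + symbol.text.length > maxCharacters) {
      continue; // too large for the remaining budget; try smaller symbols
    }
    kept.push(symbol);
    used += symbol.text.length;
  }
  return kept;
}
```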

A context report is missing important files

Increase:

maxExternalFiles
maxRelatedSymbols
wordpressIndexMaxFiles
maxCharacters

and inspect diagnostics in debugMarkdown.


Limitations

No context extraction library can guarantee perfect results for every repository.

Known limitations:

  • highly dynamic imports may not always resolve,
  • runtime-generated WordPress hooks may not be fully detected,
  • complex PHP variable-based template paths may need fallback context,
  • minified files are poor inputs, and the library intentionally does not optimize for them,
  • very large files may be truncated by configured character budgets,
  • framework-specific support is still incomplete and is being expanded over time.

The library is designed to prefer truthful, inspectable context over hallucinated certainty.

If something cannot be resolved, it should emit warnings or diagnostics.


Roadmap

Planned improvements:

  • persistent optional cache,
  • more language-specific extractors,
  • deeper Python import resolution,
  • deeper Next.js route/app directory awareness,
  • deeper Nuxt auto-import awareness,
  • improved PHP namespace/class resolution,
  • better Composer/autoload understanding,
  • better WordPress block/theme.json context,
  • CLI package command,
  • GitHub Action wrapper,
  • HTML report output,
  • VS Code extension integration.

Contributing

Contributions are welcome.

Good contribution areas:

  • new language extractors,
  • better framework support,
  • more WordPress fixtures,
  • performance improvements,
  • diagnostics improvements,
  • README/examples,
  • bug reproductions with small fixtures.

Recommended workflow:

npm install
npm run check
npm run test
npm run build

Before opening a pull request:

npm run check
npm run test
npm run build
npm run pack:dry

Please include tests for new extraction behavior.


License and attribution

This project is licensed under the MIT License.

That means you can:

  • use it commercially,
  • modify it,
  • fork it,
  • include it in your own tools,
  • publish improvements,
  • distribute copies.

The MIT License requires keeping the copyright and license notice in copies or substantial portions of the software.

Please preserve attribution to:

mlinaresweb

when reusing, forking or publishing derived versions of this library.

See LICENSE and NOTICE.md.
