A CLI tool to detect copy-paste code (duplicate pasta) in your TypeScript/JavaScript projects.
- Semantic Detection (default): Finds semantically similar functions using AST-based normalization (different variable names, same logic)
- Syntactic Detection: Finds exact duplicate code blocks using line-based hashing

Installation:

    bun install -g pastapolice
    npm install -g pastapolice

Usage:

    pastapolice <path> [options]

| Option | Alias | Description | Default |
|---|---|---|---|
| `--min-lines` | `-m` | Minimum lines for a code block | 5 |
| `--syntactic` | `-s` | Use syntactic (line-based) detection | false |

Examples:

    # Scan current directory (semantic mode)
    pastapolice .
    # Scan with custom minimum lines
    pastapolice . -m 10
    # Syntactic mode (exact duplicates)
    pastapolice . --syntactic
    # Scan specific directory
    pastapolice ./src -m 3

Semantic detection uses AST-based normalization to find functions that do the same thing but differ in variable names, formatting, or minor syntax.

Syntactic detection finds exact duplicate code blocks by comparing normalized lines. It is good for catching literal copy-paste.

Semantic detection works as follows:

- Parses TypeScript files using the TypeScript compiler API
- Extracts function declarations, methods, arrow functions, and function expressions
- Normalizes by removing comments and replacing:
  - Identifiers → `VAR1`, `VAR2`, `VAR3`, ...
  - String/numeric literals → `LIT`
- Hashes the normalized representation
- Groups functions with identical hashes as semantic duplicates
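
The normalization idea above can be illustrated with a small sketch. This is not pastapolice's implementation: it swaps the TypeScript compiler API for a simplified regex tokenizer so that it runs without dependencies, it does not handle regex literals or template interpolation, and it groups by the normalized string itself instead of a hash. The names `normalizeFunction`, `groupSemanticDuplicates`, `FunctionSource`, and `KEYWORDS` are hypothetical.

```typescript
// Simplified, token-level stand-in for AST-based normalization.
// Assumption: a short keyword list is enough for illustration.
const KEYWORDS = new Set([
  "function", "return", "const", "let", "var", "if", "else", "for", "while",
  "of", "in", "new", "typeof", "async", "await", "true", "false", "null",
]);

// Matches, in order: line comments, block comments, string literals,
// numbers, identifiers, then any other single non-space character.
const TOKEN =
  /\/\/[^\n]*|\/\*[\s\S]*?\*\/|"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|`(?:\\.|[^`\\])*`|\d+(?:\.\d+)?|[A-Za-z_$][\w$]*|\S/g;

function normalizeFunction(source: string): string {
  const names = new Map<string, string>();
  const out: string[] = [];
  for (const token of source.match(TOKEN) ?? []) {
    // Drop comments entirely.
    if (token.startsWith("//") || token.startsWith("/*")) continue;
    // String and numeric literals all become LIT.
    if (/^["'`\d]/.test(token)) {
      out.push("LIT");
      continue;
    }
    // Identifiers become VAR1, VAR2, ... in order of first appearance.
    if (/^[A-Za-z_$]/.test(token) && !KEYWORDS.has(token)) {
      if (!names.has(token)) names.set(token, `VAR${names.size + 1}`);
      out.push(names.get(token)!);
      continue;
    }
    // Keywords and punctuation are kept as-is.
    out.push(token);
  }
  return out.join(" ");
}

interface FunctionSource {
  name: string;
  source: string;
}

// Groups functions whose normalized form is identical. A real tool would
// likely hash the normalized string; the string itself works as a key here.
function groupSemanticDuplicates(functions: FunctionSource[]): string[][] {
  const groups = new Map<string, string[]>();
  for (const fn of functions) {
    const key = normalizeFunction(fn.source);
    const bucket = groups.get(key) ?? [];
    bucket.push(fn.name);
    groups.set(key, bucket);
  }
  return Array.from(groups.values()).filter((bucket) => bucket.length > 1);
}
```

Under this sketch, `function sum(a, b) { return a + b + 1; }` and `function total(x, y) { return x + y + 2; }` both normalize to `function VAR1 ( VAR2 , VAR3 ) { return VAR2 + VAR3 + LIT ; }`, so they land in the same group.
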

Syntactic detection works as follows:

- Scans each file with a sliding window of consecutive lines
- Normalizes lines (removes whitespace)
- Hashes using xxhash64
- Groups identical hashes as duplicates
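
A minimal sketch of the sliding-window steps above, under stated assumptions: the window size is taken to be the minimum line count, files are passed in as an in-memory map instead of being read from disk, and a small 32-bit FNV-1a variant stands in for xxhash64 so the example has no dependencies. The names `fnv1a`, `findDuplicateBlocks`, and `Block` are hypothetical and not part of pastapolice.

```typescript
interface Block {
  file: string;
  startLine: number; // 1-based line where the window begins
}

// 32-bit FNV-1a over UTF-16 code units. Assumption: used here only as a
// dependency-free stand-in for xxhash64.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function findDuplicateBlocks(
  files: Record<string, string>,
  minLines = 5,
): Block[][] {
  const seen = new Map<number, Block[]>();
  for (const [file, text] of Object.entries(files)) {
    // Normalize each line by removing all whitespace.
    const lines = text.split("\n").map((line) => line.replace(/\s+/g, ""));
    // Slide a window of minLines consecutive lines over the file.
    for (let start = 0; start + minLines <= lines.length; start++) {
      const chunk = lines.slice(start, start + minLines).join("\n");
      const key = fnv1a(chunk);
      const bucket = seen.get(key) ?? [];
      bucket.push({ file, startLine: start + 1 });
      seen.set(key, bucket);
    }
  }
  // Windows sharing a hash are reported as duplicates. With a 32-bit hash,
  // a careful tool would also compare the text to rule out collisions.
  return Array.from(seen.values()).filter((bucket) => bucket.length > 1);
}
```

Calling `findDuplicateBlocks({ "a.ts": sourceA, "b.ts": sourceB }, 3)` returns one array of `Block` entries per repeated window, so overlapping windows inside a long duplicated region show up as several adjacent groups unless they are merged afterwards.
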

Development:

    # Install dependencies
    bun install
    # Run directly
    bun run src/index.ts .
    # Build
    bun run build

License: MIT