@principal-ai/markdown-search

High-performance full-text search for markdown documents using FlexSearch and Bun.

Features

🚀 Fast Performance - Built on the Bun runtime for fast file operations
🔍 Full-Text Search - Powered by FlexSearch for efficient indexing and searching
📝 Markdown-Optimized - Understands markdown structure (sections, code blocks, tables, etc.)
🎯 Flexible Searching - Filter by document type and code-block language, with fuzzy matching
💾 Persistent Indexes - Save and load search indexes for instant startup
🔌 Extensible - Adapter pattern for different platforms (Node, VS Code, etc.)
🏗️ TypeScript - Full TypeScript support with comprehensive types

Installation

bun add @principal-ai/markdown-search

Or with npm:

npm install @principal-ai/markdown-search

Quick Start

import { createSearchEngine } from '@principal-ai/markdown-search';

// Create a search engine instance
const searchEngine = createSearchEngine({
  rootPath: './docs',        // Directory to search
  storagePath: '.search',    // Where to store the index
  indexKey: 'my-docs'        // Name for this index
});

// Initialize and index files
await searchEngine.initialize();
await searchEngine.indexFiles();

// Search for content
const results = await searchEngine.search('your query');

results.forEach(result => {
  console.log(`${result.title} (${result.type})`);
  console.log(`Score: ${result.score}`);
  console.log(`File: ${result.fileName}`);
});

Advanced Usage

Custom Configuration

import { 
  SearchEngine, 
  NodeFileSystemAdapter, 
  NodeStorageAdapter,
  SearchEngineFactory 
} from '@principal-ai/markdown-search';

const searchEngine = new SearchEngine({
  fileSystem: new NodeFileSystemAdapter('./docs'),
  storage: new NodeStorageAdapter('.search-index'),
  searchEngine: SearchEngineFactory.create('flexsearch', {
    // FlexSearch options
    tokenize: 'forward',
    resolution: 9,
    depth: 3,
  })
});

Indexing with Progress

await searchEngine.indexFiles({
  onProgress: (progress) => {
    console.log(`${progress.phase}: ${progress.percentage}%`);
    if (progress.currentFile) {
      console.log(`Processing: ${progress.currentFile}`);
    }
  },
  batchSize: 10,
  indexChunks: true, // Index individual code blocks, tables, etc.
});
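
On large document sets the callback above may fire very often. The following is a minimal sketch of a throttled reporter, assuming the progress object carries only the phase, percentage, and currentFile fields used in the example. The IndexProgress interface and the createThrottledProgress helper are hypothetical and not part of the package.

```typescript
// Hypothetical shape: mirrors only the fields used in the README example.
interface IndexProgress {
  phase: string;
  percentage: number;
  currentFile?: string;
}

/**
 * Returns an onProgress callback that reports only when the phase changes
 * or the percentage has advanced by at least `step` points since the last
 * reported value.
 */
function createThrottledProgress(
  step: number,
  report: (line: string) => void = console.log,
): (progress: IndexProgress) => void {
  let lastPhase: string | undefined;
  let lastReported = -Infinity;

  return (progress) => {
    const phaseChanged = progress.phase !== lastPhase;
    const advanced = progress.percentage - lastReported >= step;
    if (!phaseChanged && !advanced) return;

    lastPhase = progress.phase;
    lastReported = progress.percentage;
    report(`${progress.phase}: ${Math.round(progress.percentage)}%`);
  };
}
```

It could then be passed as onProgress: createThrottledProgress(5) in the indexFiles call.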

Search Options

const results = await searchEngine.search('query', {
  // Filter by document type
  types: ['section', 'code', 'table'],
  
  // Filter by programming language (for code blocks)
  languages: ['typescript', 'javascript'],
  
  // Fuzzy search threshold (0-1)
  fuzzyThreshold: 0.8,
  
  // Pagination
  limit: 10,
  offset: 0,
  
  // Search specific fields
  fields: ['content', 'title'],
  
  // Sort options
  sortBy: 'relevance',
  sortOrder: 'desc'
});
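
The limit and offset options can be combined to walk through a large result set page by page. This is a rough sketch under the assumption that search returns fewer than limit results once the result set is exhausted. PageOptions, PagedSearcher, and searchPages are hypothetical names introduced for illustration; the structural interface only covers the two options it uses.

```typescript
// Hypothetical minimal shapes; the package's real SearchOptions and
// SearchResult types carry more fields than shown here.
interface PageOptions {
  limit?: number;
  offset?: number;
}

interface PagedSearcher<R> {
  search(query: string, options?: PageOptions): Promise<R[]>;
}

/** Yields non-empty result pages until a page is shorter than `pageSize`. */
async function* searchPages<R>(
  engine: PagedSearcher<R>,
  query: string,
  pageSize = 10,
): AsyncGenerator<R[]> {
  if (pageSize <= 0) throw new RangeError("pageSize must be positive");
  for (let offset = 0; ; offset += pageSize) {
    const page = await engine.search(query, { limit: pageSize, offset });
    if (page.length > 0) yield page;
    // A short (or empty) page is assumed to mean there is nothing left.
    if (page.length < pageSize) return;
  }
}
```

A caller would consume it with for await (const page of searchPages(searchEngine, 'query', 10)) and stop early by breaking out of the loop.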

Document Types

The search engine understands different types of markdown content:

document - Entire markdown file
section - Document sections (based on headings)
code - Code blocks with language detection
mermaid - Mermaid diagrams
table - Markdown tables
heading - Individual headings
paragraph - Regular text paragraphs
list - List items
blockquote - Quoted text
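
Because every result carries one of these types, mixed results can be bucketed for display, for example to show code matches separately from prose. Below is a small sketch that assumes only the type and title fields seen in the Quick Start output. TypedResult and groupByType are hypothetical helpers, not package exports.

```typescript
// Hypothetical minimal result shape: only `type` and `title` are taken from
// the README examples.
interface TypedResult {
  type: string;
  title: string;
}

/** Groups results by document type, preserving the incoming result order. */
function groupByType<R extends TypedResult>(results: R[]): Map<string, R[]> {
  const groups = new Map<string, R[]>();
  for (const result of results) {
    const bucket = groups.get(result.type);
    if (bucket) {
      bucket.push(result);
    } else {
      groups.set(result.type, [result]);
    }
  }
  return groups;
}
```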

Updating the Index

// Update specific files
await searchEngine.updateFiles([
  '/path/to/file1.md',
  '/path/to/file2.md'
]);

// Clear and rebuild index
await searchEngine.clearIndex();
await searchEngine.indexFiles();
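
updateFiles expects the caller to supply the paths to refresh. One way to work out that list is to compare modification times between two scans. The sketch below is illustrative only: Snapshot and changedPaths are hypothetical, and how the modification times are collected (for example with fs.stat) is left to the caller.

```typescript
/** Maps a file path to its last-modified time in milliseconds. */
type Snapshot = Map<string, number>;

/** Returns paths that are new, or have a newer mtime, relative to `previous`. */
function changedPaths(previous: Snapshot, current: Snapshot): string[] {
  const changed: string[] = [];
  current.forEach((mtime, path) => {
    const before = previous.get(path);
    if (before === undefined || mtime > before) {
      changed.push(path);
    }
  });
  return changed;
}
```

The returned array can be passed straight to searchEngine.updateFiles. Deleted files are not covered here, since this README does not describe how removals are handled.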

Index Management

// Check if index exists
const hasIndex = await searchEngine.hasIndex();

// Get index statistics
const stats = await searchEngine.getStats();
console.log(`Total files: ${stats.totalFiles}`);
console.log(`Total documents: ${stats.totalDocuments}`);

// Export/Import index for backup
const indexData = await searchEngine.getSearchAdapter().exportIndex();
// ... save indexData somewhere ...

// Later, import it back
await searchEngine.getSearchAdapter().importIndex(indexData);
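
A common startup pattern is to build the index only when none exists yet. This is a minimal sketch, assuming hasIndex reports a previously persisted index once initialize has run; that behavior is not confirmed by this README. IndexLifecycle and ensureIndex are hypothetical names.

```typescript
// Hypothetical structural type covering only the methods this sketch calls;
// their names and signatures are taken from the API Reference section.
interface IndexLifecycle {
  initialize(): Promise<void>;
  hasIndex(): Promise<boolean>;
  indexFiles(): Promise<unknown>;
}

/** Builds the index only when none exists. Resolves to true if it indexed. */
async function ensureIndex(engine: IndexLifecycle): Promise<boolean> {
  await engine.initialize();
  if (await engine.hasIndex()) return false;
  await engine.indexFiles();
  return true;
}
```

Calling await ensureIndex(searchEngine) in place of the separate initialize and indexFiles calls would skip re-indexing on later runs, provided the assumption above holds.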

Platform Support

Node.js/Bun (Default)

The package includes built-in adapters for Node.js and Bun environments:

NodeFileSystemAdapter - File system operations using Bun's fast APIs
NodeStorageAdapter - File-based storage for indexes

VS Code Extension

VS Code adapters are included so the package can run inside VS Code extensions:

import { 
  VSCodeFileSystemAdapter, 
  VSCodeStorageAdapter 
} from '@principal-ai/markdown-search/adapters';

Custom Adapters

You can create custom adapters for other platforms:

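// Assumption: these types are exported from the package root (see types.ts).
import type {
  SearchFileSystemAdapter,
  FindOptions,
  FileInfo
} from '@principal-ai/markdown-search';

class MyCustomFileSystemAdapter implements SearchFileSystemAdapter {
  async findMarkdownFiles(options?: FindOptions): Promise<FileInfo[]> {
    // Your implementation: return one FileInfo per markdown file found
    throw new Error('findMarkdownFiles is not implemented');
  }

  async readFile(path: string): Promise<string> {
    // Your implementation: return the file's text content
    throw new Error('readFile is not implemented');
  }

  // ... other required methods
}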
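
For unit tests, an adapter backed by an in-memory map avoids touching the disk. The sketch below models only the two methods shown above and uses a stand-in MemoryFileInfo type. The real FileInfo fields and the remaining required methods are not listed in this README, so treat every name here as hypothetical.

```typescript
// Hypothetical stand-in for the package's FileInfo type. Only
// findMarkdownFiles and readFile appear in the README; the real adapter
// interface requires further methods that are not modelled here.
interface MemoryFileInfo {
  path: string;
}

class InMemoryFileSystemAdapter {
  private readonly files: Map<string, string>;

  constructor(files: Record<string, string>) {
    this.files = new Map(Object.entries(files));
  }

  /** Lists every stored path ending in .md (case-insensitive), sorted. */
  async findMarkdownFiles(): Promise<MemoryFileInfo[]> {
    return Array.from(this.files.keys())
      .filter((path) => path.toLowerCase().endsWith(".md"))
      .sort()
      .map((path) => ({ path }));
  }

  /** Resolves to the stored content, or rejects when the path is unknown. */
  async readFile(path: string): Promise<string> {
    const content = this.files.get(path);
    if (content === undefined) {
      throw new Error(`File not found: ${path}`);
    }
    return content;
  }
}
```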

API Reference

SearchEngine

The main class for searching markdown documents.

Constructor

new SearchEngine(config: SearchEngineConfig, indexKey?: string)

Methods

initialize(): Promise<void> - Initialize the search engine
indexFiles(options?: IndexingOptions): Promise<IndexResult> - Index all markdown files
search(query: string, options?: SearchOptions): Promise<SearchResult[]> - Search the index
updateFiles(paths: string[], options?: IndexingOptions): Promise<IndexResult> - Update specific files
clearIndex(): Promise<void> - Clear the entire index
hasIndex(): Promise<boolean> - Check if index exists
getStats(): Promise<SearchIndexStats | null> - Get index statistics

Types

See the types.ts file for all available TypeScript types.

Examples

Check the examples directory for more usage examples:

basic-search.ts - Basic search functionality

Performance

The package is optimized for performance:

Bun Runtime: Leverages Bun's fast file I/O operations
Batch Processing: Indexes files in configurable batches
Incremental Updates: Re-indexes only the files passed to updateFiles instead of rebuilding the whole index
Persistent Indexes: Load pre-built indexes instantly

Development

# Install dependencies
bun install

# Run tests
bun test

# Build
bun run build

# Type checking
bun run typecheck

# Format code
bun run format

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Credits

Built by the A24Z Team as part of the markdown tooling ecosystem.
