Production-grade semantic text analysis with embeddings, similarity computation, and vector search operations.
Part of the CrashBytes npm ecosystem | Built by Blackhole Software, LLC
When building ML-powered production systems, prioritize:
- Lazy initialization - Models load on demand, minimizing startup overhead
- Type safety - Comprehensive TypeScript definitions prevent runtime failures
- Resource efficiency - Quantized models reduce memory footprint by 75%
- Defensive programming - Semantic error codes enable precise debugging
```bash
npm install @crashbytes/semantic-text-toolkit
```

```typescript
import { createSemanticEngine } from '@crashbytes/semantic-text-toolkit';

const engine = await createSemanticEngine();

const result = await engine.embed("Machine learning transforms data");
console.log(result.embedding); // 384-dimensional vector

const similarity = await engine.similarity(
  "Artificial intelligence is fascinating",
  "Machine learning is interesting"
);
console.log(similarity.score); // 0.78
```

Transform text into high-dimensional numerical vectors that capture semantic meaning, enabling:
- Semantic similarity computation beyond keyword matching
- Vector-based search operations at scale
- Content clustering and classification
- Intelligent recommendation systems
Multiple metrics for domain-specific optimization (each is sketched in code after this list):
- Cosine similarity - Preferred for normalized vectors (range: -1 to 1)
- Euclidean distance - Direct geometric distance in vector space
- Dot product - Efficient for pre-normalized embeddings
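Each metric reduces to a few lines of vector arithmetic. The following is an illustrative sketch of the math, not the toolkit's internals; the package exports its own implementations (e.g. `cosineSimilarity`):

```typescript
// Illustrative implementations of the three metrics, for intuition only.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

function cosine(a: number[], b: number[]): number {
  const norm = (v: number[]) => Math.sqrt(dot(v, v));
  return dot(a, b) / (norm(a) * norm(b)); // range: -1 to 1
}

function euclidean(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0));
}
```

For unit-length embeddings, `cosine` and `dot` return identical scores, which is why the dot product is the cheaper choice for pre-normalized vectors.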
Production-ready semantic search with:
- Configurable ranking strategies
- Metadata filtering for complex queries
- O(n log k) complexity for top-k retrieval (see the heap sketch after this list)
- Index persistence through export/import
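A common way to reach O(n log k) for top-k retrieval is a bounded min-heap over scored candidates; the sketch below illustrates the technique and is an assumption about the approach, not the library's source:

```typescript
type Scored = { id: string; score: number };

// Keep only the k highest-scoring items in a min-heap of size k.
// Each heap operation is O(log k), so scanning n candidates is O(n log k).
function topK(items: Iterable<Scored>, k: number): Scored[] {
  if (k <= 0) return [];
  const heap: Scored[] = [];
  const swap = (i: number, j: number) => {
    [heap[i], heap[j]] = [heap[j], heap[i]];
  };
  const siftUp = (i: number) => {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (heap[parent].score <= heap[i].score) break;
      swap(i, parent);
      i = parent;
    }
  };
  const siftDown = (i: number) => {
    for (;;) {
      let smallest = i;
      const l = 2 * i + 1;
      const r = 2 * i + 2;
      if (l < heap.length && heap[l].score < heap[smallest].score) smallest = l;
      if (r < heap.length && heap[r].score < heap[smallest].score) smallest = r;
      if (smallest === i) break;
      swap(i, smallest);
      i = smallest;
    }
  };

  for (const item of items) {
    if (heap.length < k) {
      heap.push(item);
      siftUp(heap.length - 1);
    } else if (item.score > heap[0].score) {
      heap[0] = item; // evict the current minimum
      siftDown(0);
    }
  }
  return heap.sort((a, b) => b.score - a.score); // best match first
}
```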
Core engine for embedding generation and similarity computation.
new SemanticEngine(config?: ModelConfig)

Configuration Parameters:
- `modelName` - Hugging Face model identifier (default: `'Xenova/all-MiniLM-L6-v2'`)
- `maxLength` - Maximum sequence length (default: `512`)
- `quantized` - Enable quantization (default: `true`)
- `onProgress` - Progress callback for model loading
Initializes the model. Idempotent and concurrent-safe through promise caching.
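The promise-caching pattern behind that guarantee fits in a few lines; this is a sketch of the pattern, not the toolkit's source:

```typescript
// Lazy, concurrent-safe initialization via promise caching: every caller
// awaits the same cached promise, so the model loads exactly once even
// when initialize() is invoked concurrently.
class LazyModel {
  private initPromise: Promise<void> | null = null;

  initialize(): Promise<void> {
    this.initPromise ??= this.loadModel();
    return this.initPromise;
  }

  private async loadModel(): Promise<void> {
    // ...download weights, build the pipeline, warm up...
  }
}
```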
Generates embedding for single text input. Returns vector with metadata.
Batch processing with automatic batching and progress tracking.
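A usage sketch follows; the method name `embedBatch` and the options shape are assumptions for illustration, so check the package's typings for the actual signature:

```typescript
// Hypothetical usage: `embedBatch` and its options object are assumed
// names, not confirmed API.
const texts = ["first document", "second document", "third document"];
const results = await engine.embedBatch(texts, {
  onProgress: (done: number, total: number) =>
    console.info(`embedded ${done}/${total}`),
});
```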
async similarity(textA: string, textB: string, method?: 'cosine' | 'euclidean' | 'dot'): Promise<SimilarityResult>
Computes semantic similarity using specified metric.
High-level search interface with indexing capabilities.
new SemanticSearch<T>(engine: SemanticEngine, config?: SearchConfig<T>)

Configuration Parameters:
- `topK` - Number of results to return (default: `10`)
- `threshold` - Minimum similarity score (default: `0`)
- `textExtractor` - Function to extract text from custom objects
- `metadataExtractor` - Function to extract metadata for filtering
Indexes items for semantic search with optional index replacement.
Performs semantic search with configurable parameters.
async searchWithFilter(query: string, filter: (metadata: Record<string, unknown>) => boolean): Promise<SearchResult<T>[]>
Searches with metadata filtering for complex queries.
```typescript
interface Document {
  id: string;
  title: string;
  content: string;
  category: string;
}

const search = new SemanticSearch<Document>(engine, {
  textExtractor: (doc) => `${doc.title} ${doc.content}`,
  metadataExtractor: (doc) => ({ category: doc.category }),
});

await search.index(documents);

const results = await search.searchWithFilter(
  "machine learning",
  (metadata) => metadata.category === 'AI'
);
```

```typescript
import { centroid, cosineSimilarity } from '@crashbytes/semantic-text-toolkit';

const embeddings = await Promise.all(
  documents.map(doc => engine.embed(doc.content))
);

const clusterCenter = centroid(embeddings.map(r => r.embedding));

const distances = embeddings.map(result =>
  cosineSimilarity(result.embedding, clusterCenter)
);
```

When optimizing for response time:
- Pre-initialize models at application startup (see the sketch after this list)
- Implement request batching for concurrent operations
- Enable GPU acceleration in production environments
- Use connection pooling for API deployments
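Pre-initialization simply means paying the model-load cost before traffic arrives, as in this minimal sketch (the handler is illustrative):

```typescript
import { createSemanticEngine } from '@crashbytes/semantic-text-toolkit';

// Load and warm the model once at startup, before the server starts
// accepting requests, so no request pays the initialization cost.
const engine = await createSemanticEngine();

// Request handlers can then assume a warm engine:
export async function handleEmbedRequest(text: string) {
  return engine.embed(text);
}
```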
When managing resource limitations:
- Leverage quantized models (enabled by default)
- Clear search indexes when not actively in use
- Process data in smaller, manageable batches (see the sketch after this list)
- Consider model distillation for further reduction
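Chunked processing keeps peak memory flat: persist each chunk's vectors (e.g. to a vector store) and let them go out of scope before the next chunk. A minimal sketch, assuming embeddings are plain number arrays:

```typescript
import { createSemanticEngine } from '@crashbytes/semantic-text-toolkit';

const engine = await createSemanticEngine();

// Embed a large corpus in fixed-size chunks, handing each chunk's
// vectors to a consumer (e.g. a vector-store writer) so they can be
// released before the next chunk is processed.
async function embedInChunks(
  texts: string[],
  onChunk: (vectors: number[][]) => Promise<void>,
  chunkSize = 32,
) {
  for (let i = 0; i < texts.length; i += chunkSize) {
    const chunk = texts.slice(i, i + chunkSize);
    const results = await Promise.all(chunk.map((t) => engine.embed(t)));
    await onChunk(results.map((r) => r.embedding)); // persist, then drop
  }
}
```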
When scaling for volume:
- Implement a worker pool pattern for parallel processing (see the sketch after this list)
- Use message queues (RabbitMQ, Redis) for load distribution
- Deploy on GPU-enabled infrastructure for compute-intensive workloads
- Utilize approximate nearest neighbor (ANN) algorithms for large-scale search
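A full worker_threads pool is beyond the scope of a README, but the core of the pattern, bounding how many tasks run at once, fits in a short sketch:

```typescript
// Minimal concurrency limiter: run at most `limit` tasks at a time.
// A production worker pool (worker_threads, separate processes) applies
// the same principle across CPU cores; this sketch stays in-process.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim the next index
      results[i] = await task(items[i]);
    }
  }

  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    worker,
  );
  await Promise.all(workers);
  return results;
}

// e.g. bound concurrent embedding calls to four in-flight tasks:
// const vectors = await mapWithLimit(texts, 4, (t) => engine.embed(t));
```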
Single Embedding Generation:
- CPU (Apple M1): ~30ms
- CPU (Intel i7): ~50ms
- GPU (CUDA): ~5ms
Batch Processing (100 texts):
- Sequential: ~3000ms
- Batched (size=32): ~800ms
- Speedup: 3.75x
Memory Profile:
- Model (quantized): ~23MB
- Base runtime: ~100MB
- Per 1000 embeddings: ~1.5MB
```typescript
import { SemanticEngine } from '@crashbytes/semantic-text-toolkit';

// Larger multilingual model with full-precision (non-quantized) weights.
const engine = new SemanticEngine({
  modelName: 'Xenova/multilingual-e5-large',
  maxLength: 512,
  quantized: false
});
```

```typescript
import { SemanticEngine } from '@crashbytes/semantic-text-toolkit';

// Log download progress while the model initializes.
const engine = new SemanticEngine({
  modelName: 'Xenova/all-MiniLM-L6-v2',
  quantized: true,
  onProgress: (progress) => {
    if (progress.status === 'downloading') {
      console.info(`Model download: ${progress.progress}%`);
    }
  }
});
```

When contributing to this project:
- Self-documenting code - Clear variable names, focused functions
- Comprehensive test coverage - Unit, integration, and E2E tests
- Intentional design choices - Document architectural decisions
- Continuous refactoring - Maintain code health proactively
```bash
npm run build
```

Generates:

- `dist/index.js` (CommonJS)
- `dist/index.mjs` (ES Modules)
- `dist/index.d.ts` (TypeScript definitions)
Contributions welcome. When contributing:
- Maintain architectural consistency
- Add comprehensive tests
- Document public APIs
- Follow existing code style
- Update CHANGELOG.md
MIT License - see LICENSE file for details
Specializing in custom web and software solutions:
- React, Astro, Next.js
- Node.js, C#
- React Native, SwiftUI, Kotlin
- AI/ML integration
Visit us at blackholesoftware.com
Built with precision. Designed for production.