A high-performance regular expression library for Python with JIT compilation and SIMD optimizations.
- JIT Compilation: LLVM-based just-in-time compilation for complex patterns
- SIMD Optimizations: AVX2/AVX512/SSE4.2/NEON support for vectorized operations
- Smart Caching: Automatic caching of compiled patterns to avoid recompilation
- Python Integration: Seamless integration via pybind11
- High Performance: 1.5-5x faster than the standard `re` module for specific use cases
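To illustrate what caching compiled patterns buys, here is a minimal sketch built on the standard `re` module. It is not FastRegex's implementation: the `PatternCache` class, its LRU eviction policy, and the `max_size` parameter are assumptions made for this example.

```python
import re
from collections import OrderedDict


class PatternCache:
    """Hypothetical LRU cache of compiled patterns that tracks its hit rate."""

    def __init__(self, max_size=128):
        self.max_size = max_size
        self._entries = OrderedDict()  # pattern string -> compiled pattern
        self.hits = 0
        self.misses = 0

    def get(self, pattern):
        """Return a compiled pattern, compiling it only on a cache miss."""
        if pattern in self._entries:
            self.hits += 1
            self._entries.move_to_end(pattern)  # mark as most recently used
            return self._entries[pattern]
        self.misses += 1
        compiled = re.compile(pattern)
        self._entries[pattern] = compiled
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the least recently used
        return compiled

    def size(self):
        return len(self._entries)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = PatternCache(max_size=2)
for pattern in [r"\d+", r"\w+", r"\d+", r"\d+"]:
    cache.get(pattern)
print(f"Cache size: {cache.size()}, hit rate: {cache.hit_rate():.2%}")
```

The first two lookups compile their patterns and the last two reuse the cached `\d+`, so half of the lookups skip compilation.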
| Test Case | Python re (ms) | FastRegex (ms) | Speedup |
|---|---|---|---|
| Short literals | 0.0040 | 0.0023 | 1.7x |
| Simple patterns | 0.0041 | 0.0025 | 1.6x |
| Find all matches | 0.0127 | 0.0095 | 1.3x |
| Match operations | 0.0040 | 0.0023 | 1.7x |
Key insights:
- 1.3-1.7x faster across the benchmarks above
- Best performance on short literals and simple patterns
- Fully compatible with standard `re` module behavior
- Optimized for patterns < 50 characters
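Numbers like those in the table can be approximated with the standard `timeit` module. The sketch below is not the project's benchmark script; the helper names `time_ms` and `speedup` and the repeat counts are assumptions for this example, and the FastRegex measurement runs only if the package is installed.

```python
import re
import timeit


def time_ms(func, *args, repeat=5, number=10_000):
    """Return the best per-call time in milliseconds over `repeat` runs."""
    runs = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(runs) / number * 1000.0


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


pattern, text = r"\d+", "abc123def"
baseline = time_ms(re.search, pattern, text)
print(f"re.search: {baseline:.4f} ms per call")

try:
    import fastregex  # optional: measured only when installed

    candidate = time_ms(fastregex.search, pattern, text)
    print(f"fastregex.search: {candidate:.4f} ms per call "
          f"({speedup(baseline, candidate):.1f}x)")
except ImportError:
    print("fastregex is not installed; skipping the comparison")
```

Taking the minimum over several runs reduces noise from other processes. Absolute timings depend on the machine, so only the ratio is meaningful.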
# Clone the repository
git clone https://github.com/baksvell/fastregex.git
cd fastregex
# Run with Docker
docker-compose up -d fastregex
# Enter the container
docker exec -it fastregex-dev bash
# Use FastRegex
python -c "import fastregex; print('FastRegex ready!')"

# Install from PyPI
pip install fastregex

# Or install from source
git clone https://github.com/baksvell/fastregex.git
cd fastregex
pip install -e .

Requirements:
- CMake 3.20+
- Python 3.10+
- C++17 compiler (GCC/MSVC/Clang)
import fastregex
# Simple search
result = fastregex.search(r'\d+', 'abc123def')
print(result) # True
# Find all matches
matches = fastregex.find_all(r'\w+', 'hello world test')
print(matches) # ['hello', 'world', 'test']
# Replace
new_text = fastregex.replace(r'\d+', 'abc123def456', 'XXX')
print(new_text) # 'abcXXXdefXXX'
# Compile for reuse
compiled = fastregex.compile(r'\d+')
result = compiled.search('abc123def')
print(result)  # True

# Check cache statistics
print(f"Cache size: {fastregex.cache_size()}")
print(f"Hit rate: {fastregex.hit_rate():.2%}")
# Pattern information
compiled = fastregex.compile(r'\d+')
print(f"Pattern: {compiled.pattern()}")
print(f"JIT compiled: {compiled.jit_compiled}")

Use FastRegex for:
- Short literal patterns (1.7x faster)
- Simple regex patterns (1.6x faster)
- Match operations (1.7x faster)
- Find all operations (1.3x faster)
- Patterns < 50 characters
Prefer the standard `re` module for:
- Very large texts (>10MB)
- Complex regex patterns with many groups
- Advanced regex features
- Long patterns (>50 characters)
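The size criteria in these lists can be written as a small predicate. This is a sketch only: the function name `prefer_fastregex` and both constants are hypothetical, 10MB is approximated as a character count, and the group-count and feature criteria are left out because the lists give no concrete thresholds for them.

```python
MAX_PATTERN_CHARS = 50             # from "Patterns < 50 characters"
MAX_TEXT_CHARS = 10 * 1024 * 1024  # from "Very large texts (>10MB)", in characters


def prefer_fastregex(pattern: str, text: str) -> bool:
    """Return True when pattern and text sizes fall in FastRegex's stated range."""
    short_pattern = len(pattern) < MAX_PATTERN_CHARS
    moderate_text = len(text) <= MAX_TEXT_CHARS
    return short_pattern and moderate_text


print(prefer_fastregex(r"\d+", "abc123def"))
print(prefer_fastregex("a" * 60, "abc123def"))
```

The first call falls inside both limits; the second exceeds the pattern-length limit.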
import re
import fastregex as fr
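def smart_match(pattern, text):
    # Return a bool from both branches: fastregex.search yields True/False in
    # the examples above, while re.search returns a match object or None.
    # The thresholds are illustrative; tune them for your workload.
    if len(pattern) > 15 and len(text) > 1000:
        return fr.search(pattern, text)
    return re.search(pattern, text) is not None

Run the test suite: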
python -m pytest tests/

Run performance benchmarks:
python tests/benchmark.py

Core functions:
- `fastregex.match(pattern, text)` - Match from start of string
- `fastregex.search(pattern, text)` - Search anywhere in string
- `fastregex.find_all(pattern, text)` - Find all matches
- `fastregex.replace(pattern, text, replacement)` - Replace matches
- `fastregex.compile(pattern)` - Compile pattern for reuse

Cache management:
- `fastregex.cache_size()` - Get current cache size
- `fastregex.hit_rate()` - Get cache hit rate
- `fastregex.clear_cache()` - Clear the cache

Compiled pattern:
- `compiled.pattern()` - Get the compiled pattern
- `compiled.jit_compiled` - Check if pattern is JIT compiled
- `compiled.compile_time()` - Get compilation time
Contributions are welcome! Please see CONTRIBUTING.md for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.