# LoPace: Complete User Guide

**Lossless Optimized Prompt Accurate Compression Engine**

This notebook provides a comprehensive guide to using LoPace for compressing and decompressing prompts with various configurations and use cases.

## Table of Contents

1. [Introduction](#introduction)
2. [Installation](#installation)
3. [Quick Start](#quick-start)
4. [Compression Methods](#compression-methods)
   - Zstd Compression
   - Token-based Compression
   - Hybrid Compression (Recommended)
5. [Configuration Options](#configuration-options)
   - Tokenizer Models
   - Zstd Compression Levels
6. [Advanced Usage](#advanced-usage)
   - Cross-Instance Compression/Decompression
   - Batch Processing
   - Compression Statistics
   - Using `compress_and_return_both`
7. [Real-World Examples](#real-world-examples)
8. [Best Practices](#best-practices)
9. [Performance Benchmarks](#performance-benchmarks)

## Introduction

LoPace is a professional Python package for compressing and decompressing prompts using multiple lossless compression techniques. It's designed to help you:

- **Reduce storage costs**: Achieve up to 80% space reduction
- **Improve performance**: Fast compression/decompression speeds (50-200 MB/s)
- **Maintain data integrity**: 100% lossless compression guarantees
- **Scale efficiently**: Optimized for databases and large-scale applications

### Key Features

- ✅ **Three Compression Methods**: Zstd, Token-based (BPE), and Hybrid
- ✅ **Lossless**: Perfect reconstruction of original prompts
- ✅ **Configurable**: Multiple tokenizer models and compression levels
- ✅ **Production-Ready**: Minimal memory footprint and excellent scalability

## Installation

Install LoPace using pip:

In [None]:
# Install LoPace (if not already installed)
# !pip install lopace

# Import the required modules
from lopace import PromptCompressor, CompressionMethod
import sys

print(f"Python version: {sys.version}")
print("LoPace imported successfully!")

## Quick Start

The simplest way to use LoPace is to compress a single prompt and decompress it again:

In [None]:
# Initialize compressor with default settings
compressor = PromptCompressor()

# Your prompt
prompt = "You are a helpful AI assistant designed to provide accurate and detailed responses."

# Compress using hybrid method (recommended - best compression)
compressed = compressor.compress(prompt, CompressionMethod.HYBRID)

# Decompress back to original
original = compressor.decompress(compressed, CompressionMethod.HYBRID)

# Verify losslessness
print(f"Original: {prompt}")
print(f"Decompressed: {original}")
print(f"Match: {original == prompt} ✓")
print(f"\nOriginal size: {len(prompt.encode('utf-8'))} bytes")
print(f"Compressed size: {len(compressed)} bytes")
print(f"Space saved: {(1 - len(compressed)/len(prompt.encode('utf-8')))*100:.1f}%")

## Compression Methods

LoPace provides three compression methods, each with different characteristics:

### Method 1: Zstd Compression

Uses Zstandard, a dictionary-style (LZ77-family) compressor, to find and encode repeated patterns. Best for general text compression when you want to avoid the overhead of tokenization.

In [None]:
compressor = PromptCompressor()

prompt = """You are an expert software engineer with expertise in Python, JavaScript, 
and cloud architecture. Provide detailed, well-structured code examples and explanations."""

# Zstd compression
zstd_compressed = compressor.compress_zstd(prompt)
zstd_decompressed = compressor.decompress_zstd(zstd_compressed)

print("=== Zstd Compression ===")
print(f"Original size: {len(prompt.encode('utf-8'))} bytes")
print(f"Compressed size: {len(zstd_compressed)} bytes")
print(f"Compression ratio: {len(prompt.encode('utf-8'))/len(zstd_compressed):.2f}x")
print(f"Space saved: {(1 - len(zstd_compressed)/len(prompt.encode('utf-8')))*100:.1f}%")
print(f"Lossless: {zstd_decompressed == prompt} ✓")
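
To see why a pattern-matching compressor depends on repetition, here is a minimal sketch that uses the standard library's `zlib` as a stand-in for Zstandard. Both belong to the LZ77 family, but this is not LoPace's code path, and `space_saved_percent` is a hypothetical helper introduced only for this illustration.

```python
import zlib

def space_saved_percent(text: str, level: int = 6) -> float:
    """Percent of bytes saved by zlib (a stand-in for Zstd) on UTF-8 text."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, level)
    return (1 - len(packed) / len(raw)) * 100

short = "You are a helpful AI assistant."
repeated = short * 20  # the same sentence 20 times

print(f"short:    {space_saved_percent(short):6.1f}% saved")
print(f"repeated: {space_saved_percent(repeated):6.1f}% saved")
```

Very short prompts contain few repeated patterns, so fixed framing overhead can dominate and the saving may be small or even negative. Longer or more repetitive prompts are where this family of compressors pays off.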

### Method 2: Token-based Compression

Uses Byte-Pair Encoding (BPE) to convert text to token IDs, then packs them as binary data. Best when you need token IDs anyway or are working with LLM tokenizers.

In [None]:
# Token-based compression
token_compressed = compressor.compress_token(prompt)
token_decompressed = compressor.decompress_token(token_compressed)

print("=== Token-based Compression ===")
print(f"Original size: {len(prompt.encode('utf-8'))} bytes")
print(f"Compressed size: {len(token_compressed)} bytes")
print(f"Compression ratio: {len(prompt.encode('utf-8'))/len(token_compressed):.2f}x")
print(f"Space saved: {(1 - len(token_compressed)/len(prompt.encode('utf-8')))*100:.1f}%")
print(f"Lossless: {token_decompressed == prompt} ✓")
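
The "pack them as binary data" step can be sketched with the standard `struct` module. The layout below (a one-byte width flag followed by little-endian integers) is an assumption made for illustration, not LoPace's documented format, and the token IDs are made up.

```python
import struct

def pack_token_ids(token_ids: list[int]) -> bytes:
    """Pack token IDs as little-endian integers behind a 1-byte width flag."""
    # Hypothetical format: flag 0 -> uint16 IDs, flag 1 -> uint32 IDs.
    wide = any(t > 0xFFFF for t in token_ids)
    fmt = "I" if wide else "H"
    return bytes([1 if wide else 0]) + struct.pack(f"<{len(token_ids)}{fmt}", *token_ids)

def unpack_token_ids(data: bytes) -> list[int]:
    """Inverse of pack_token_ids."""
    width, fmt = (4, "I") if data[0] == 1 else (2, "H")
    count = (len(data) - 1) // width
    return list(struct.unpack(f"<{count}{fmt}", data[1:]))

small_ids = [2675, 527, 264, 11190, 15592, 18328, 13]  # made-up IDs, all below 65536
large_ids = small_ids + [99999]                        # one ID that needs 32 bits

print(len(pack_token_ids(small_ids)), "bytes with 16-bit IDs")
print(len(pack_token_ids(large_ids)), "bytes with 32-bit IDs")
```

The point of the sketch is that the packed size depends on the token count and the integer width, not on the character count, which is why a single large ID can double the cost of every token in the sequence under this layout.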

### Method 3: Hybrid Compression (Recommended)

Combines tokenization and Zstd compression for maximum efficiency. This is the **recommended method** for database storage where maximum compression is needed.

In [None]:
# Hybrid compression (best compression ratio)
hybrid_compressed = compressor.compress_hybrid(prompt)
hybrid_decompressed = compressor.decompress_hybrid(hybrid_compressed)

print("=== Hybrid Compression (Recommended) ===")
print(f"Original size: {len(prompt.encode('utf-8'))} bytes")
print(f"Compressed size: {len(hybrid_compressed)} bytes")
print(f"Compression ratio: {len(prompt.encode('utf-8'))/len(hybrid_compressed):.2f}x")
print(f"Space saved: {(1 - len(hybrid_compressed)/len(prompt.encode('utf-8')))*100:.1f}%")
print(f"Lossless: {hybrid_decompressed == prompt} ✓")

# Compare all methods
print("\n=== Comparison of All Methods ===")
methods = {
    "Zstd": (zstd_compressed, zstd_decompressed),
    "Token": (token_compressed, token_decompressed),
    "Hybrid": (hybrid_compressed, hybrid_decompressed)
}

for method_name, (compressed, decompressed) in methods.items():
    ratio = len(prompt.encode('utf-8'))/len(compressed)
    saved = (1 - len(compressed)/len(prompt.encode('utf-8')))*100
    print(f"{method_name:8s}: {len(compressed):4d} bytes, {ratio:.2f}x ratio, {saved:5.1f}% saved")
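
As a rough mental model of the hybrid pipeline (tokenize, pack, then compress), the following sketch chains three standard-library steps. It is an illustration under loud assumptions: a toy regex tokenizer replaces BPE, `zlib` replaces Zstd, and the names `build_vocab`, `hybrid_compress`, and `hybrid_decompress` are hypothetical.

```python
import re
import struct
import zlib

# Splits text into whitespace runs, word runs, and single symbols.
# Every character lands in exactly one piece, so "".join(pieces) == text.
PIECE = re.compile(r"\s+|\w+|[^\w\s]")

def build_vocab(text: str) -> dict[str, int]:
    """Toy stand-in for a fixed BPE vocabulary: piece -> ID in first-seen order."""
    return {piece: i for i, piece in enumerate(dict.fromkeys(PIECE.findall(text)))}

def hybrid_compress(text: str, vocab: dict[str, int]) -> bytes:
    ids = [vocab[piece] for piece in PIECE.findall(text)]  # 1. tokenize
    packed = struct.pack(f"<{len(ids)}H", *ids)            # 2. pack IDs as uint16
    return zlib.compress(packed, 9)                        # 3. compress the packed bytes

def hybrid_decompress(data: bytes, vocab: dict[str, int]) -> str:
    pieces = list(vocab)  # ID -> piece (dicts preserve insertion order)
    packed = zlib.decompress(data)
    ids = struct.unpack(f"<{len(packed) // 2}H", packed)
    return "".join(pieces[i] for i in ids)

prompt = "You are an expert software engineer. You are an expert reviewer. " * 3
vocab = build_vocab(prompt)
blob = hybrid_compress(prompt, vocab)
print(len(prompt.encode("utf-8")), "->", len(blob), "bytes")
print("Lossless:", hybrid_decompress(blob, vocab) == prompt)
```

Unlike this toy, a real tokenizer ships a fixed vocabulary, so nothing extra has to be stored next to the compressed bytes beyond the tokenizer's name.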

## Configuration Options

LoPace provides several configuration options to optimize compression for your specific use case:

### Tokenizer Models

Different tokenizer models can be used depending on your needs. The default is `cl100k_base` (OpenAI's GPT-4 tokenizer).

In [None]:
prompt = "The quick brown fox jumps over the lazy dog. 🦊"

# Test different tokenizer models
models = ["cl100k_base", "p50k_base", "r50k_base"]

print("=== Comparing Tokenizer Models ===")
print(f"Prompt: {prompt}\n")

for model in models:
    try:
        compressor = PromptCompressor(model=model)
        
        # Get token count
        tokens = compressor.tokenizer.encode(prompt)
        
        # Compress using hybrid method
        compressed = compressor.compress_hybrid(prompt)
        
        print(f"\nModel: {model}")
        print(f"  Token count: {len(tokens)}")
        print(f"  Original size: {len(prompt.encode('utf-8'))} bytes")
        print(f"  Compressed size: {len(compressed)} bytes")
        print(f"  Space saved: {(1 - len(compressed)/len(prompt.encode('utf-8')))*100:.1f}%")
    except Exception as e:
        print(f"\nModel: {model} - Error: {e}")

### Zstd Compression Levels

Zstd compression levels range from 1 (fastest, less compression) to 22 (slowest, best compression). The default is 15 (balanced).

In [None]:
import time

long_prompt = """You are a comprehensive AI assistant specializing in technical documentation 
and educational content. Your expertise spans multiple domains including computer science, 
data science, machine learning, software engineering, and web development. When responding 
to queries, you should provide thorough explanations, include relevant examples, and 
structure your responses in a clear and organized manner.""" * 5

print("=== Comparing Zstd Compression Levels ===")
print(f"Prompt length: {len(long_prompt)} characters\n")

# Test different compression levels
levels = [1, 5, 10, 15, 19, 22]

results = []
for level in levels:
    compressor = PromptCompressor(zstd_level=level)
    
    # Time the compression
    start = time.perf_counter()
    compressed = compressor.compress_hybrid(long_prompt)
    compression_time = (time.perf_counter() - start) * 1000  # ms
    
    original_size = len(long_prompt.encode('utf-8'))
    compressed_size = len(compressed)
    space_saved = (1 - compressed_size/original_size) * 100
    
    results.append({
        'level': level,
        'size': compressed_size,
        'time': compression_time,
        'saved': space_saved
    })
    
    print(f"Level {level:2d}: {compressed_size:5d} bytes, "
          f"{space_saved:5.1f}% saved, {compression_time:6.2f} ms")

print("\n💡 Tip: Higher levels (15-19) provide the best balance of compression and speed.")
print("      Level 22 maximizes compression but can be significantly slower.")

## Advanced Usage

### Cross-Instance Compression/Decompression

**Important**: You can compress with one instance and decompress with another, as long as you use the **same tokenizer model**.

In [None]:
prompt = "This is a test prompt for cross-instance compression."

# Compress with one instance
compressor1 = PromptCompressor(model="cl100k_base", zstd_level=15)
compressed = compressor1.compress_hybrid(prompt)

# Decompress with a NEW instance (same model)
compressor2 = PromptCompressor(model="cl100k_base", zstd_level=20)  # Different zstd_level is OK
original = compressor2.decompress_hybrid(compressed)

print("=== Cross-Instance Compression/Decompression ===")
print(f"Original: {prompt}")
print(f"Decompressed: {original}")
print(f"Match: {original == prompt} ✓")
print("\n✅ Works perfectly as long as both instances use the same tokenizer model!")
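
The reason the tokenizer must match is that token IDs are only indices into a vocabulary. A toy illustration with two hypothetical vocabularies (hand-written lists, not real tokenizer models) shows what goes wrong when the IDs are decoded against a different table:

```python
def encode(pieces: list[str], vocab: list[str]) -> list[int]:
    """Map each piece to its index in the vocabulary."""
    return [vocab.index(piece) for piece in pieces]

def decode(ids: list[int], vocab: list[str]) -> str:
    """Map each index back to its piece and join."""
    return "".join(vocab[i] for i in ids)

pieces = ["You", " are", " a", " helpful", " assistant", "."]
vocab_a = ["You", " are", " a", " helpful", " assistant", "."]
vocab_b = ["You", " a", " are", " assistant", " helpful", "."]  # same pieces, different IDs

ids = encode(pieces, vocab_a)
print(decode(ids, vocab_a))  # You are a helpful assistant.
print(decode(ids, vocab_b))  # You a are assistant helpful.
```

Decoding with the wrong vocabulary does not necessarily raise an error; it can silently produce different text, which is why recording the tokenizer name alongside the compressed data is worthwhile.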

### Batch Processing

Compress and decompress multiple prompts efficiently:

In [None]:
compressor = PromptCompressor()

# Batch of prompts
prompts = [
    "You are a helpful AI assistant.",
    "Translate the following text to French.",
    "Summarize this document in 3 sentences.",
    "You are an expert Python developer.",
    "Explain the concept of machine learning."
]

print("=== Batch Processing ===")
print(f"Processing {len(prompts)} prompts...\n")

compressed_batch = []
decompressed_batch = []

for i, prompt in enumerate(prompts, 1):
    # Compress
    compressed = compressor.compress_hybrid(prompt)
    compressed_batch.append(compressed)
    
    # Decompress
    decompressed = compressor.decompress_hybrid(compressed)
    decompressed_batch.append(decompressed)
    
    # Verify
    original_size = len(prompt.encode('utf-8'))
    compressed_size = len(compressed)
    space_saved = (1 - compressed_size/original_size) * 100
    
    print(f"Prompt {i}: {original_size:3d} → {compressed_size:3d} bytes "
          f"({space_saved:5.1f}% saved), Lossless: {decompressed == prompt} ✓")

# Verify all
all_match = all(orig == decomp for orig, decomp in zip(prompts, decompressed_batch))
print(f"\n✅ All prompts processed successfully: {all_match}")

### Compression Statistics

Get detailed statistics about compression performance:

In [None]:
compressor = PromptCompressor()

prompt = """You are a comprehensive AI assistant specializing in technical documentation 
and educational content. Your expertise spans multiple domains including computer science, 
data science, machine learning, software engineering, and web development."""

# Get statistics for all methods
stats = compressor.get_compression_stats(prompt)

print("=== Compression Statistics ===")
print(f"Original Size: {stats['original_size_bytes']} bytes")
print(f"Original Tokens: {stats['original_size_tokens']}")
print("\nMethod Comparison:")
print("-" * 70)

for method, method_stats in stats['methods'].items():
    print(f"\n{method.upper()}:")
    print(f"  Compressed Size: {method_stats['compressed_size_bytes']} bytes")
    print(f"  Compression Ratio: {method_stats['compression_ratio']:.2f}x")
    print(f"  Space Saved: {method_stats['space_saved_percent']:.2f}%")
    print(f"  Bytes Saved: {method_stats['bytes_saved']} bytes")

# Get stats for a specific method
print("\n" + "=" * 70)
print("Statistics for Hybrid method only:")
hybrid_stats = compressor.get_compression_stats(prompt, CompressionMethod.HYBRID)
print(f"Compressed Size: {hybrid_stats['methods']['hybrid']['compressed_size_bytes']} bytes")
print(f"Space Saved: {hybrid_stats['methods']['hybrid']['space_saved_percent']:.2f}%")
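
The derived figures can be reproduced by hand from the two sizes. The sketch below uses the same formulas as the `print` statements earlier in this notebook; `compression_stats` is a hypothetical helper whose key names mirror the ones read above, and it is not the library's implementation.

```python
def compression_stats(original_size: int, compressed_size: int) -> dict:
    """Derive ratio, percent saved, and bytes saved from two byte counts."""
    return {
        "compression_ratio": original_size / compressed_size,
        "space_saved_percent": (1 - compressed_size / original_size) * 100,
        "bytes_saved": original_size - compressed_size,
    }

stats_example = compression_stats(original_size=1000, compressed_size=250)
print(stats_example)
# {'compression_ratio': 4.0, 'space_saved_percent': 75.0, 'bytes_saved': 750}
```

Note that a 4x ratio and 75% saved describe the same result; the ratio grows without bound while the percentage approaches 100.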

### Using `compress_and_return_both`

A convenience method that returns both the original and compressed versions:

In [None]:
compressor = PromptCompressor()

prompt = "You are a helpful AI assistant."

# Get both original and compressed
original, compressed = compressor.compress_and_return_both(prompt, CompressionMethod.HYBRID)

print("=== Using compress_and_return_both ===")
print(f"Original: {original}")
print(f"Original stored: {original == prompt} ✓")
print(f"Compressed size: {len(compressed)} bytes")
print(f"Original size: {len(prompt.encode('utf-8'))} bytes")
print(f"Space saved: {(1 - len(compressed)/len(prompt.encode('utf-8')))*100:.1f}%")

## Real-World Examples

### Example 1: Storing System Prompts in a Database

Save space when storing system prompts for multiple users:

In [None]:
compressor = PromptCompressor()

# System prompt template
system_prompt = """You are a professional customer service AI assistant. 
Your role is to help customers with their inquiries in a friendly, helpful, and efficient manner. 
Always be polite, professional, and aim to resolve issues quickly. If you cannot help, 
escalate to a human agent."""

# Simulate storing for multiple users
num_users = 1000
original_total = len(system_prompt.encode('utf-8')) * num_users

# Compress once and reuse
compressed_prompt = compressor.compress_hybrid(system_prompt)
compressed_total = len(compressed_prompt) * num_users

print("=== Database Storage Example ===")
print(f"System prompt size: {len(system_prompt.encode('utf-8'))} bytes")
print(f"Compressed size: {len(compressed_prompt)} bytes")
print(f"\nFor {num_users:,} users:")
print(f"  Original total: {original_total:,} bytes ({original_total/1024/1024:.2f} MB)")
print(f"  Compressed total: {compressed_total:,} bytes ({compressed_total/1024/1024:.2f} MB)")
print(f"  Space saved: {(1 - compressed_total/original_total)*100:.1f}%")
print(f"  Savings: {(original_total - compressed_total)/1024/1024:.2f} MB")

# Verify we can decompress
decompressed = compressor.decompress_hybrid(compressed_prompt)
print(f"\n✅ Lossless: {decompressed == system_prompt}")
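
For an actual database round trip, compressed bytes are typically stored in a binary column. Here is a minimal sketch with the standard library's `sqlite3`, assuming a hypothetical `prompts` table and using `zlib` as a stand-in so the example runs without LoPace installed; in practice the two helper functions would call `compress_hybrid` and `decompress_hybrid`.

```python
import sqlite3
import zlib

def compress(text: str) -> bytes:
    """Stand-in for compressor.compress_hybrid."""
    return zlib.compress(text.encode("utf-8"), 9)

def decompress(blob: bytes) -> str:
    """Stand-in for compressor.decompress_hybrid."""
    return zlib.decompress(blob).decode("utf-8")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prompts (name TEXT PRIMARY KEY, method TEXT NOT NULL, body BLOB NOT NULL)"
)

system_prompt = "You are a professional customer service AI assistant. " * 4
conn.execute(
    "INSERT INTO prompts VALUES (?, ?, ?)",
    ("support", "zlib-9", compress(system_prompt)),
)
conn.commit()

method, body = conn.execute(
    "SELECT method, body FROM prompts WHERE name = ?", ("support",)
).fetchone()
restored = decompress(body)
print(method, len(body), "bytes stored, lossless:", restored == system_prompt)
```

Storing the method name in its own column keeps each row self-describing, so the decompression settings can be looked up per row if they change later.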

### Example 2: Compressing Conversation Histories

Store conversation histories efficiently:

In [None]:
compressor = PromptCompressor()

# Simulate a conversation history
conversation = """User: What is Python?
Assistant: Python is a high-level programming language known for its simplicity and readability.
User: Can you give me an example?
Assistant: Sure! Here's a simple example:
```python
def greet(name):
    return f"Hello, {name}!"
print(greet("World"))
```
User: Thanks!
Assistant: You're welcome! Let me know if you need any other help."""

# Compress conversation
compressed_conv = compressor.compress_hybrid(conversation)
decompressed_conv = compressor.decompress_hybrid(compressed_conv)

print("=== Conversation History Compression ===")
print(f"Original size: {len(conversation.encode('utf-8'))} bytes")
print(f"Compressed size: {len(compressed_conv)} bytes")
print(f"Space saved: {(1 - len(compressed_conv)/len(conversation.encode('utf-8')))*100:.1f}%")
print(f"Lossless: {decompressed_conv == conversation} ✓")

### Example 3: Compressing LLM API Responses

Reduce storage when caching LLM responses:

In [None]:
compressor = PromptCompressor()

# Simulate LLM response
llm_response = """Machine Learning (ML) is a subset of artificial intelligence that enables 
systems to learn and improve from experience without being explicitly programmed. It focuses 
on developing algorithms that can access data and use it to learn patterns and make predictions 
or decisions. There are three main types of machine learning:
1. Supervised Learning: Uses labeled data to train models
2. Unsupervised Learning: Finds patterns in unlabeled data
3. Reinforcement Learning: Learns through trial and error with rewards/penalties"""

# Compress response
compressed_response = compressor.compress_hybrid(llm_response)
decompressed_response = compressor.decompress_hybrid(compressed_response)

print("=== API Response Compression ===")
print(f"Original size: {len(llm_response.encode('utf-8'))} bytes")
print(f"Compressed size: {len(compressed_response)} bytes")
print(f"Compression ratio: {len(llm_response.encode('utf-8'))/len(compressed_response):.2f}x")
print(f"Space saved: {(1 - len(compressed_response)/len(llm_response.encode('utf-8')))*100:.1f}%")
print(f"Lossless: {decompressed_response == llm_response} ✓")

## Best Practices

### 1. Choose the Right Method

- **Hybrid** (Recommended): Best compression ratio, ideal for database storage
- **Token**: Use when you need token IDs anyway or are working with LLM tokenizers
- **Zstd**: Fast and simple, good for general text compression

### 2. Select Appropriate Configuration

- **Tokenizer Model**: Use `cl100k_base` (default) for GPT-4 compatibility, or choose based on your LLM
- **Zstd Level**: Use 15 (default) for balanced performance, or adjust based on your speed/storage priorities

### 3. Maintain Consistency

- Use the same tokenizer model for compression and decompression
- Document your compression settings for reproducibility
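
One way to document settings is to keep them physically attached to the data. The sketch below prefixes a compressed payload with a small JSON header; the envelope layout and the `wrap`/`unwrap` names are hypothetical, not something LoPace provides.

```python
import json
import struct

def wrap(payload: bytes, settings: dict) -> bytes:
    """Prefix payload with a 2-byte header length and a JSON settings header."""
    header = json.dumps(settings, sort_keys=True).encode("utf-8")
    return struct.pack("<H", len(header)) + header + payload

def unwrap(blob: bytes) -> tuple[dict, bytes]:
    """Split a wrapped blob back into (settings, payload)."""
    (header_len,) = struct.unpack_from("<H", blob, 0)
    settings = json.loads(blob[2:2 + header_len].decode("utf-8"))
    return settings, blob[2 + header_len:]

settings = {"method": "hybrid", "model": "cl100k_base", "zstd_level": 15}
payload = b"\x00\x01placeholder-compressed-bytes"  # stands in for real compressor output

recovered_settings, recovered_payload = unwrap(wrap(payload, settings))
print(recovered_settings)
```

The header costs a few dozen bytes per item, which can outweigh the savings on short prompts. Recording the same settings once per table or per column is the cheaper alternative when every row shares them.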

### 4. Consider Your Use Case

- **Large-scale storage**: Use Hybrid method with zstd_level 15-19
- **Fast processing needed**: Use lower zstd_level (5-10) or Zstd method
- **Real-time applications**: Consider lower compression levels for speed
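
These trade-offs can be turned into a measurement on your own data. The sketch below picks the lowest level whose output is within a tolerance of the smallest size seen; `pick_level` is a hypothetical helper, and it defaults to `zlib.compress` (levels 1-9) as a stand-in, though any function taking `(data, level)` can be passed in.

```python
import zlib
from typing import Callable, Iterable

def pick_level(
    data: bytes,
    levels: Iterable[int],
    compress: Callable[[bytes, int], bytes] = zlib.compress,
    tolerance: float = 0.02,
) -> int:
    """Return the lowest level whose output is within `tolerance` of the best size."""
    sizes = {level: len(compress(data, level)) for level in levels}
    if not sizes:
        raise ValueError("no levels supplied")
    best = min(sizes.values())
    # Walk levels from cheapest to most expensive and stop at the first good one.
    return next(level for level in sorted(sizes) if sizes[level] <= best * (1 + tolerance))

sample = ("You are a helpful AI assistant designed to provide accurate answers. " * 50).encode("utf-8")
print("Chosen zlib level:", pick_level(sample, range(1, 10)))
```

Running this against a representative sample of your prompts, rather than a synthetic string, gives a more trustworthy answer than any general rule of thumb.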

## Performance Benchmarks

Let's benchmark compression performance across different prompt sizes:

In [None]:
import time

compressor = PromptCompressor()

# Test prompts of different sizes
test_prompts = {
    "Small": "You are a helpful AI assistant.",
    "Medium": """You are a helpful AI assistant designed to provide accurate, 
detailed, and helpful responses to user queries. Your goal is to assist users 
by understanding their questions and providing relevant information.""",
    "Large": """You are a comprehensive AI assistant specializing in technical documentation 
and educational content. Your expertise spans multiple domains including computer science, 
data science, machine learning, software engineering, and web development. When responding 
to queries, you should provide thorough explanations, include relevant examples, and 
structure your responses in a clear and organized manner. Always aim to educate while 
solving problems. Break down complex concepts into digestible parts, use analogies when 
helpful, and provide practical applications of theoretical knowledge.""" * 3
}

print("=== Performance Benchmarks ===")
print("Testing all three compression methods:\n")

results = []

for size_name, prompt in test_prompts.items():
    original_size = len(prompt.encode('utf-8'))
    
    print(f"\n{size_name} Prompt ({original_size} bytes):")
    print("-" * 60)
    
    for method in [CompressionMethod.ZSTD, CompressionMethod.TOKEN, CompressionMethod.HYBRID]:
        # Time compression
        start = time.perf_counter()
        compressed = compressor.compress(prompt, method)
        compression_time = (time.perf_counter() - start) * 1000  # ms
        
        # Time decompression
        start = time.perf_counter()
        decompressed = compressor.decompress(compressed, method)
        decompression_time = (time.perf_counter() - start) * 1000  # ms
        
        # Verify losslessness
        is_lossless = decompressed == prompt
        
        compressed_size = len(compressed)
        space_saved = (1 - compressed_size/original_size) * 100
        
        results.append({
            'size': size_name,
            'method': method.value,
            'original': original_size,
            'compressed': compressed_size,
            'space_saved': space_saved,
            'compression_time': compression_time,
            'decompression_time': decompression_time,
            'lossless': is_lossless
        })
        
        print(f"{method.value:8s}: {compressed_size:4d} bytes, "
              f"{space_saved:5.1f}% saved, "
              f"compress: {compression_time:5.2f}ms, "
              f"decompress: {decompression_time:5.2f}ms, "
              f"Lossless: {'✓' if is_lossless else '✗'}")

print("\n✅ All benchmarks completed successfully!")
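
Single-shot timings of sub-millisecond operations tend to be noisy, since one scheduler hiccup or cold cache can dominate the result. A minimal sketch of a steadier measurement takes the median of repeated runs; `median_ms` is a hypothetical helper, shown here timing `zlib.compress` as a stand-in for any compressor call.

```python
import statistics
import time
import zlib

def median_ms(func, *args, repeats: int = 21) -> float:
    """Run func(*args) `repeats` times and return the median duration in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)

data = ("You are a helpful AI assistant. " * 200).encode("utf-8")
print(f"zlib level 6: {median_ms(zlib.compress, data, 6):.3f} ms (median of 21 runs)")
```

The same helper could wrap `compressor.compress` and `compressor.decompress` from the benchmark above to get figures that are easier to compare across methods.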

## Summary

LoPace provides powerful, lossless compression for prompts with:

- ✅ **Three compression methods** to choose from
- ✅ **Flexible configuration** options (tokenizer models, compression levels)
- ✅ **100% lossless** guarantees
- ✅ **Excellent compression ratios** (up to 80% space savings)
- ✅ **Production-ready** performance

### Quick Reference

```python
from lopace import PromptCompressor, CompressionMethod

# Initialize
compressor = PromptCompressor(model="cl100k_base", zstd_level=15)

# Compress (Hybrid recommended)
compressed = compressor.compress_hybrid("Your prompt here")

# Decompress
original = compressor.decompress_hybrid(compressed)

# Verify
assert original == "Your prompt here"  # ✓ Always True (lossless)
```

### Next Steps

- Check out the [LoPace documentation](https://github.com/amanulla/lopace)
- Run the Streamlit app: `streamlit run streamlit_app.py`
- Explore the source code for advanced use cases

---

**Happy Compressing! 🚀**