# Question-Specific Compression Tutorial

Learn how to compress context **based on a specific question**, keeping the parts most relevant to that question.

## 📚 Available Models

| Model | Type | Description |
|-------|------|-------------|
| **qs_gemfilter_v1** | Token-level | Query-specific compression using GemFilter attention |
| **qs_sat_v1** | Binary filter | Segment Any Text (SaT) hierarchical filtering |

### Key Differences:
- **GemFilter**: Token-level compression; keeps the tokens most relevant to the query, with a controllable target ratio
- **SaT**: Binary filter; keeps or removes entire sentences/paragraphs, with no ratio control
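
To make the difference in granularity concrete, here is a toy sketch in plain Python. It is not the GemFilter or SaT algorithm: both helpers (`sentence_filter`, `token_filter`) are hypothetical and use simple keyword overlap, only to show what "whole sentences" versus "individual tokens under a budget" looks like.

```python
import re


def sentence_filter(text: str, query: str) -> str:
    """Binary filter: keep a whole sentence if it shares a word with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = [s for s in sentences if query_words & set(re.findall(r"\w+", s.lower()))]
    return " ".join(kept)


def token_filter(text: str, query: str, keep_ratio: float) -> str:
    """Token-level filter: keep about keep_ratio of the tokens, query matches first."""
    query_words = set(re.findall(r"\w+", query.lower()))
    tokens = text.split()
    budget = max(1, round(len(tokens) * keep_ratio))
    # Rank positions: tokens that match the query come first, then original order.
    ranked = sorted(
        range(len(tokens)),
        key=lambda i: (tokens[i].strip(".,").lower() not in query_words, i),
    )
    keep = sorted(ranked[:budget])  # restore reading order
    return " ".join(tokens[i] for i in keep)


text = "Python is used in data science. Java runs on the JVM."
query = "Python data"
print(sentence_filter(text, query))    # whole sentences survive or disappear
print(token_filter(text, query, 0.3))  # fragments survive, sized by the ratio
```

The token-level variant can cut sentences mid-phrase, which matches the fragmentary GemFilter results recorded later in this tutorial.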

## 1. Installation

In [15]:
# Install from local source in editable mode
%pip install -e ../ python-dotenv

Obtaining file:///Users/oussama/Documents/Cmprsr/codebase/Compresr-SDK-Private/python
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: compresr
  Building editable for compresr (pyproject.toml) ... done
  Created wheel for compresr: filename=compresr-1.1.1-0.editable-py3-none-any.whl size=5409 sha256=ff907e0925223fdc9f14b1760badfaddda6bb92921aba350c37ad11666f58e23
  Stored in directory: /private/var/folders/g4/rzc_8mgd15n5f0l73k0mm3600000gn/T/pip-ephem-wheel-cache-tidln8cy/wheels/b8/66/33/38ffb0bd34b6589c6f5f9d07abdc3fcffb34736427d2cdac72
Successfully built compresr
Installing collected packages: compresr
  Attempting uninstall: compresr
    Found existing installation: compresr 1.1.1
    Uninstalling compresr-1.1.1:
      Successfully u

## 2. Setup

In [None]:
import os
from dotenv import load_dotenv
from compresr import QSCompressor, MODELS

# Load environment variables
env_path = "../../.env"
load_dotenv(env_path)

api_key = os.getenv("COMPRESR_API_KEY")

# For local testing
os.environ["COMPRESR_BASE_URL"] = "http://localhost:8000"

# Initialize client
client = QSCompressor(api_key=api_key)

print("✅ Client initialized!")
print(f"🌐 Base URL: {os.getenv('COMPRESR_BASE_URL', 'https://api.compresr.ai')}")
print("\n📚 Available QS Models:")
print(f"   - {MODELS.QS_GEMFILTER}: GemFilter (default) - supports compression_ratio")
print(f"   - {MODELS.QS_SAT}: SaT - binary filter, NO compression_ratio")

✅ Client initialized!
🌐 Base URL: http://localhost:8000

📚 Available QS Models:
   - qs_gemfilter_v1: GemFilter (default) - supports compression_ratio
   - qs_sat_v1: SaT - binary filter, NO compression_ratio
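
If `load_dotenv` points at the wrong path, `api_key` is silently `None`. A minimal sketch of a fail-fast check, assuming the same `COMPRESR_API_KEY` and `COMPRESR_BASE_URL` variable names used above; `resolve_settings` is a hypothetical helper, not part of the SDK, and the default URL is taken from the setup cell's print statement.

```python
import os
from typing import Mapping, Tuple

# Default shown in the setup cell's print statement.
DEFAULT_BASE_URL = "https://api.compresr.ai"


def resolve_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (api_key, base_url), raising early if the key is missing."""
    api_key = env.get("COMPRESR_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "COMPRESR_API_KEY is not set; check the path passed to load_dotenv"
        )
    base_url = env.get("COMPRESR_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return api_key, base_url
```

Calling it right after `load_dotenv(env_path)` surfaces a missing key before the first request is made.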


---
# Part 1: GemFilter Model (qs_gemfilter_v1)

**Token-level compression** - Extracts only tokens relevant to your query

## 1. Basic GemFilter Compression

In [None]:
# Large context with multiple topics
context = """
Machine learning is a comprehensive subset of artificial intelligence that enables computer 
systems to automatically learn and improve from experience without explicit programming. 
The field encompasses supervised learning, unsupervised learning, and reinforcement learning.

Natural language processing (NLP) is a branch of AI dealing with the interaction between 
computers and humans using natural language. Modern NLP uses transformer architectures like 
BERT and GPT for tasks such as translation, sentiment analysis, and question answering.

Computer vision enables machines to derive meaningful information from digital images and 
videos. It uses convolutional neural networks (CNNs) for tasks like object detection, facial 
recognition, and autonomous vehicle navigation.

Reinforcement learning is where agents learn to behave optimally by performing actions and 
receiving rewards or penalties. It has achieved remarkable success in game playing (AlphaGo, 
chess) and robotics control applications.
"""

query = "What is machine learning and how does it work?"
target_ratio = 0.5

print("🤖 Model: qs_gemfilter_v1 (GemFilter)")
print("🔍 Token-level compression with attention mechanisms\n")

response = client.compress(
    context=context.strip(),
    query=query,
    compression_model_name=MODELS.QS_GEMFILTER,
    target_compression_ratio=target_ratio
)

print("✅ GemFilter compression successful!")
print(f"\n❓ Query: {query}")
print("\n📊 Results:")
print(f"   Original tokens: {response.data.original_tokens}")
print(f"   Compressed tokens: {response.data.compressed_tokens}")
print(f"   Tokens saved: {response.data.tokens_saved}")
print(f"   Target ratio: {target_ratio:.1%}")
print(f"   Actual ratio: {response.data.actual_compression_ratio:.1%}")
print("\n📝 Compressed context (relevant to query):")
print(response.data.compressed_context)

🤖 Model: qs_gemfilter_v1 (GemFilter)
🔍 Token-level compression with attention mechanisms

✅ GemFilter compression successful!

❓ Query: What is machine learning and how does it work?

📊 Results:
   Original tokens: 177
   Compressed tokens: 66
   Tokens saved: 111
   Target ratio: 50.0%
   Actual ratio: 62.7%

📝 Compressed context (relevant to query):
Machine learning is a comprehensive subset of artificial intelligence that enables computer 
systems to automatically learn and improve from experience without explicit programming. The field encompasses supervised learning, unsupervised learning, and reinforcement learning. Natural language processing (NLP) is a branch of AI dealing with the interaction between 
computAlphaGo, 
chess)


## 2. GemFilter with Different Compression Ratios

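Control how much content to keep with `target_compression_ratio` (the validation errors recorded in section 5 show accepted values from 0.1 to 0.9):
- **0.3** → Target: keep about 30% of tokens, remove 70% (aggressive compression)
- **0.5** → Target: keep about 50% of tokens, remove 50% (medium compression)
- **0.7** → Target: keep about 70% of tokens, remove 30% (light compression)

Note that `actual_compression_ratio` in the recorded outputs is the fraction of tokens *removed* (`tokens_saved / original_tokens`), not the fraction kept: 128 → 25 tokens is reported as 80.5%. Compare it with `1 - target_compression_ratio`, and expect the result to deviate from the target.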
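
The arithmetic behind the reported numbers can be checked by hand. The sketch below uses only the token counts from the recorded outputs in this section; `kept_fraction` and `removed_fraction` are hypothetical helper names. The removed fraction is the one that reproduces the `actual_compression_ratio` values printed in those outputs.

```python
def kept_fraction(original_tokens: int, compressed_tokens: int) -> float:
    """Share of the original tokens that survived compression."""
    return compressed_tokens / original_tokens


def removed_fraction(original_tokens: int, compressed_tokens: int) -> float:
    """Share of the original tokens that were dropped (tokens saved / original)."""
    return (original_tokens - compressed_tokens) / original_tokens


# (target ratio, original tokens, compressed tokens) from the recorded run
recorded = [(0.3, 128, 25), (0.5, 128, 42), (0.7, 128, 72)]

for target, original, compressed in recorded:
    print(
        f"target={target:.1f}  "
        f"kept={kept_fraction(original, compressed):.3f}  "
        f"removed={removed_fraction(original, compressed):.3f}"
    )
```

In that run the kept fraction rises with the target (roughly 0.20, 0.33, 0.56) but stays below it, so the target is best read as an upper guide rather than an exact outcome.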

In [18]:
context = """
Python is a high-level programming language created by Guido van Rossum in 1991.
It emphasizes code readability and supports multiple paradigms including OOP and functional.
Python is widely used in web development, data science, machine learning, and automation.

JavaScript was created by Brendan Eich in 1995 for web browsers.
It has evolved into a full-stack language with Node.js for server-side development.
JavaScript powers interactive websites and modern web applications.

Java follows the "write once, run anywhere" principle through the JVM.
It's widely used in enterprise applications and Android development.
Java enforces strong typing and automatic memory management.
"""

query = "What is Python used for?"

print("🎯 Testing Different Compression Ratios\n")

for ratio in [0.3, 0.5, 0.7]:
    response = client.compress(
        context=context.strip(),
        query=query,
        compression_model_name=MODELS.QS_GEMFILTER,
        target_compression_ratio=ratio
    )
    
    keep_pct = ratio * 100
    remove_pct = (1 - ratio) * 100
    print(f"{'='*60}")
    print(f"Target Ratio {ratio:.1f} → Keep {keep_pct:.0f}%, Remove {remove_pct:.0f}%")
    print(f"{'='*60}")
    print(f"Tokens: {response.data.original_tokens} → {response.data.compressed_tokens}")
    print(f"Actual Ratio: {response.data.actual_compression_ratio:.1%} (kept {response.data.compressed_tokens} of {response.data.original_tokens})")
    print(f"Result: {response.data.compressed_context}\n")

🎯 Testing Different Compression Ratios

Target Ratio 0.3 → Keep 30%, Remove 70%
Tokens: 128 → 25
Actual Ratio: 80.5% (kept 25 of 128)
Result: . Python is widely used in web development, data science, machine learning, and automation. JavaScript was created by Brendan Eich

Target Ratio 0.5 → Keep 50%, Remove 50%
Tokens: 128 → 42
Actual Ratio: 67.2% (kept 42 of 128)
Result: Python is a high-level programming language created OOP and functional. Python is widely used in web development, data science, machine learning, and automation. JavaScript was created by Brendan Eich in 1995 for

Target Ratio 0.7 → Keep 70%, Remove 30%
Tokens: 128 → 72
Actual Ratio: 43.8% (kept 72 of 128)
Result: Python is a high-level programming language created by Guido van Rossum in 1991. It emphasizes code readability and supports multiple paradigms including OOP and functional. Python is widely used in web development, data science, machine learning, and automation. JavaScript was created by Brendan Eich in 19

## 3. GemFilter Adapts to Different Queries

Same context, different queries → different results

In [19]:
queries = [
    "Who created Python and when?",
    "What is JavaScript used for?",
    "What is Java's main principle?"
]

target_ratio = 0.5
print(f"💡 Same context, different queries (target ratio: {target_ratio:.1%}):\n")

for q in queries:
    response = client.compress(
        context=context.strip(),
        query=q,
        compression_model_name=MODELS.QS_GEMFILTER,
        target_compression_ratio=target_ratio,
    )
    print(f"❓ Query: {q}")
    print(f"📝 Result: {response.data.compressed_context}")
    print(f"   Saved: {response.data.tokens_saved} tokens")
    print(f"   Target: {target_ratio:.1%} | Actual: {response.data.actual_compression_ratio:.1%}\n")

💡 Same context, different queries (target ratio: 50.0%):

❓ Query: Who created Python and when?
📝 Result: Python is a high-level programming language created by Guido van Rossum in 1991. It emphasizes code readability and supports multiple paradigms including OOP and,. JavaScript
   Saved: 93 tokens
   Target: 50.0% | Actual: 72.7%

❓ Query: What is JavaScript used for?
📝 Result: Pythonich 1995 for web browsers. It has evolved into a full-stack language with Node.js for server-side development. JavaScript powers interactive websites and modern web applications. Java follows the "write once, run
   Saved: 86 tokens
   Target: 50.0% | Actual: 67.2%

❓ Query: What is Java's main principle?
📝 Result: Python and modern web applications. Java follows the "write once, run anywhere" principle through the JVM. It's widely used in enterprise applications and Android development. Java enforces strong typing and automatic memory management
   Saved: 87 tokens
   Target: 50.0% | Actual: 68.0%



## 4. GemFilter Batch Compression

Compress multiple contexts in a single API call with GemFilter
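
The batch call takes a list of `{"context": ..., "query": ...}` dictionaries. A small sketch for building that list from parallel sequences and splitting it into chunks; `build_inputs` and `chunked` are hypothetical helpers, and the chunk size is an assumption, since this tutorial does not state a batch size limit.

```python
from typing import Dict, Iterable, Iterator, List


def build_inputs(contexts: Iterable[str], queries: Iterable[str]) -> List[Dict[str, str]]:
    """Pair each context with its query in the shape compress_batch is shown to accept."""
    contexts, queries = list(contexts), list(queries)
    if len(contexts) != len(queries):
        raise ValueError("contexts and queries must have the same length")
    return [{"context": c.strip(), "query": q.strip()} for c, q in zip(contexts, queries)]


def chunked(items: List[Dict[str, str]], size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


inputs = build_inputs(
    ["The solar system has 8 planets. ", "Python was created by Guido van Rossum in 1991."],
    ["How many planets are there?", "Who created Python?"],
)
for batch in chunked(inputs, 1):
    print(batch)
```

Each yielded batch can then be passed as `inputs=` to `client.compress_batch`, as in the cell below.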

In [20]:
inputs = [
    {
        "context": "Machine learning uses algorithms to learn from data. Deep learning uses neural networks. NLP processes human language.",
        "query": "What is machine learning?"
    },
    {
        "context": "The solar system has 8 planets. Jupiter is the largest. Mars is called the Red Planet.",
        "query": "How many planets are there?"
    },
    {
        "context": "Python was created by Guido van Rossum in 1991. It emphasizes readability and supports OOP.",
        "query": "Who created Python?"
    },
]

target_ratio = 0.5
print("🤖 GemFilter Batch Compression\n")

response = client.compress_batch(
    inputs=inputs,
    compression_model_name=MODELS.QS_GEMFILTER,
    target_compression_ratio=target_ratio,
)

print(f"✅ Compressed {len(response.data.results)} contexts")
print(f"   Target ratio: {target_ratio:.1%}")
print(f"   Total tokens saved: {response.data.total_tokens_saved}\n")

for i, result in enumerate(response.data.results):
    print(f"📝 Context {i+1}: Query = '{inputs[i]['query']}'")
    print(f"   Result: {result.compressed_context}")
    print(f"   Tokens: {result.original_tokens} → {result.compressed_tokens}")
    print(f"   Target: {target_ratio:.1%} | Actual: {result.actual_compression_ratio:.1%}\n")

🤖 GemFilter Batch Compression

✅ Compressed 3 contexts
   Target ratio: 50.0%
   Total tokens saved: 4

📝 Context 1: Query = 'What is machine learning?'
   Result: Machine learning uses algorithms to learn from data. Deep learning uses neural networks. NLP processes human language.
   Tokens: 21 → 21
   Target: 50.0% | Actual: 0.0%

📝 Context 2: Query = 'How many planets are there?'
   Result: The solar system has 8 planets. Jupiter is the largest. Mars is called the Red Planet.
   Tokens: 20 → 20
   Target: 50.0% | Actual: 0.0%

📝 Context 3: Query = 'Who created Python?'
   Result: Python was created by Guido van Rossum in 1991. It emphasizes readability and
   Tokens: 22 → 18
   Target: 50.0% | Actual: 18.2%



## 5. GemFilter Error Handling

Common errors specific to GemFilter compression
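
These checks can also be mirrored locally before a request is sent. A minimal sketch, assuming the 0.1 to 0.9 bounds that appear in the validation messages recorded in this section (the service may change them); `check_request` is a hypothetical helper, and rejecting whitespace-only strings is a stricter local choice, not something the recorded errors confirm.

```python
from typing import List, Optional

# Bounds copied from the recorded validation messages; treat them as an assumption.
MIN_RATIO, MAX_RATIO = 0.1, 0.9


def check_request(context: str, query: str, target_ratio: Optional[float] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means the input looks valid."""
    problems = []
    if not context.strip():
        problems.append("context must not be empty")
    if not query.strip():
        problems.append("query must not be empty")
    if target_ratio is not None and not (MIN_RATIO <= target_ratio <= MAX_RATIO):
        problems.append(
            f"target_compression_ratio must be between {MIN_RATIO} and {MAX_RATIO}"
        )
    return problems


print(check_request("Test context", "Test query", 1.5))
```

Returning all problems at once, instead of raising on the first, lets a caller report every issue in a single pass.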

In [21]:
from compresr.exceptions import ValidationError

# Error 1: Invalid compression ratio (out of range)
print("1️⃣  Invalid compression ratio...")
try:
    response = client.compress(
        context="Test context",
        query="Test query",
        compression_model_name=MODELS.QS_GEMFILTER,
        target_compression_ratio=1.5,  # ❌ Out of range (validator accepts 0.1-0.9)
    )
except ValidationError as e:
    print(f"   ✅ Caught: {e}\n")

# Error 2: Negative compression ratio
print("2️⃣  Negative compression ratio...")
try:
    response = client.compress(
        context="Test context",
        query="Test query",
        compression_model_name=MODELS.QS_GEMFILTER,
        target_compression_ratio=-0.5,  # ❌ Negative value
    )
except ValidationError as e:
    print(f"   ✅ Caught: {e}\n")

# Error 3: Empty query
print("3️⃣  Empty query...")
try:
    response = client.compress(
        context="Test context",
        query="",  # ❌ Empty query
        compression_model_name=MODELS.QS_GEMFILTER,
    )
except ValidationError as e:
    print(f"   ✅ Caught: {e}\n")

# Error 4: Empty context  
print("4️⃣  Empty context...")
try:
    response = client.compress(
        context="",  # ❌ Empty context
        query="Test query",
        compression_model_name=MODELS.QS_GEMFILTER,
    )
except ValidationError as e:
    print(f"   ✅ Caught: {e}\n")

print("✅ GemFilter error scenarios tested!")

1️⃣  Invalid compression ratio...
   ✅ Caught: 1 validation error for CompressRequest
target_compression_ratio
  Input should be less than or equal to 0.9 [type=less_than_equal, input_value=1.5, input_type=float]
    For further information visit https://errors.pydantic.dev/2.12/v/less_than_equal

2️⃣  Negative compression ratio...
   ✅ Caught: 1 validation error for CompressRequest
target_compression_ratio
  Input should be greater than or equal to 0.1 [type=greater_than_equal, input_value=-0.5, input_type=float]
    For further information visit https://errors.pydantic.dev/2.12/v/greater_than_equal

3️⃣  Empty query...
   ✅ Caught: 1 validation error for CompressRequest
query
  String should have at least 1 character [type=string_too_short, input_value='', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/string_too_short

4️⃣  Empty context...
   ✅ Caught: 1 validation error for CompressRequest
context
  Value error, context must not be empty [type

---
# Part 2: SaT Model (qs_sat_v1)

**Binary filter** - Keeps/removes entire sentences or paragraphs based on relevance

⚠️ **Important**: SaT does NOT support `target_compression_ratio`!
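
When the model is chosen at runtime, it helps to decide in one place whether the ratio is sent. A sketch using the two model names from this tutorial; `compress_kwargs` is a hypothetical helper, and silently dropping the ratio for SaT is a design choice of this sketch, not SDK behavior.

```python
from typing import Any, Dict, Optional

RATIO_MODELS = {"qs_gemfilter_v1"}  # accepts target_compression_ratio
BINARY_MODELS = {"qs_sat_v1"}       # binary filter, rejects the ratio


def compress_kwargs(
    model: str, context: str, query: str, target_ratio: Optional[float] = None
) -> Dict[str, Any]:
    """Build keyword arguments for a compress call, omitting the ratio for binary filters."""
    if model not in RATIO_MODELS | BINARY_MODELS:
        raise ValueError(f"unknown QS model: {model!r}")
    kwargs: Dict[str, Any] = {
        "context": context,
        "query": query,
        "compression_model_name": model,
    }
    if model in RATIO_MODELS and target_ratio is not None:
        kwargs["target_compression_ratio"] = target_ratio
    return kwargs


print(compress_kwargs("qs_sat_v1", "Test content", "Test query", target_ratio=0.5))
```

The result can be unpacked into the call, for example `client.compress(**compress_kwargs(...))`.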

## 1. Basic SaT Compression

In [None]:
context = """
Machine learning enables computers to learn from experience without explicit programming.
It encompasses supervised learning, unsupervised learning, and reinforcement learning.
The algorithms identify patterns in data to make predictions.

Natural language processing deals with computer-human language interaction.
Modern NLP uses transformer models like BERT and GPT.
Applications include translation, sentiment analysis, and chatbots.

Computer vision enables machines to interpret visual information.
It uses CNNs for object detection and image classification.
Applications include autonomous vehicles and medical imaging.
"""

query = "What is machine learning?"

print("🤖 Model: qs_sat_v1 (SaT)")
print("⚠️  Binary filter - NO target ratio (keeps/removes entire sentences)\n")

response = client.compress(
    context=context.strip(),
    query=query,
    compression_model_name=MODELS.QS_SAT,
    # NOTE: Do NOT pass target_compression_ratio for SaT!
)

print("✅ SaT compression successful!")
print(f"\n❓ Query: {query}")
print("\n📊 Results:")
print(f"   Original tokens: {response.data.original_tokens}")
print(f"   Compressed tokens: {response.data.compressed_tokens}")
print(f"   Tokens saved: {response.data.tokens_saved}")
print("   Target ratio: N/A (binary filter)")
print(f"   Actual ratio: {response.data.actual_compression_ratio:.1%}")
print("\n📝 Filtered content (relevant sentences only):")
print(response.data.compressed_context)

🤖 Model: qs_sat_v1 (SaT)
⚠️  Binary filter - NO target ratio (keeps/removes entire sentences)

✅ SaT compression successful!

❓ Query: What is machine learning?

📊 Results:
   Original tokens: 98
   Compressed tokens: 98
   Tokens saved: 0
   Target ratio: N/A (binary filter)
   Actual ratio: 0.0%

📝 Filtered content (relevant sentences only):
Machine learning enables computers to learn from experience without explicit programming. It encompasses supervised learning, unsupervised learning, and reinforcement learning. The algorithms identify patterns in data to make predictions. Natural language processing deals with computer-human language interaction. Modern NLP uses transformer models like BERT and GPT. Applications include translation, sentiment analysis, and chatbots. Computer vision enables machines to interpret visual information. It uses CNNs for object detection and image classification. Applications include autonomous vehicles and medical imaging.


## 2. SaT Adapts to Different Queries

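Same context, different queries: SaT is expected to keep a different set of sentences for each query. In the recorded run below, the first query returned this short context unchanged (actual ratio 0.0%).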

In [23]:
sat_context = """
Machine learning enables computers to learn from experience without explicit programming.
It encompasses supervised learning, unsupervised learning, and reinforcement learning.

Natural language processing deals with computer-human language interaction.
Modern NLP uses transformer models like BERT and GPT for text understanding.

Computer vision enables machines to interpret visual information.
CNNs are commonly used for object detection and image classification.
"""

queries_sat = [
    "What is machine learning?",
    "What is NLP?",
    "What is computer vision?"
]

print("💡 SaT keeps different sentences based on query:\n")

for q in queries_sat:
    response = client.compress(
        context=sat_context.strip(),
        query=q,
        compression_model_name=MODELS.QS_SAT,
    )
    print(f"❓ Query: {q}")
    print(f"📝 Kept sentences: {response.data.compressed_context}")
    print(f"   Actual ratio: {response.data.actual_compression_ratio:.1%}\n")

💡 SaT keeps different sentences based on query:

❓ Query: What is machine learning?
📝 Kept sentences: Machine learning enables computers to learn from experience without explicit programming. It encompasses supervised learning, unsupervised learning, and reinforcement learning. Natural language processing deals with computer-human language interaction. Modern NLP uses transformer models like BERT and GPT for text understanding. Computer vision enables machines to interpret visual information. CNNs are commonly used for object detection and image classification.
   Actual ratio: 0.0%

❓ Query: What is NLP?
📝 Kept sentences: Machine learning enables computers to learn from experience without explicit programming. It encompasses supervised learning, unsupervised learning, and reinforcement learning. Natural language processing deals with computer-human language interaction. Modern NLP uses transformer models like BERT and GPT for text understanding. Computer vision enables machines to int

## 3. SaT: Error When Using Compression Ratio

SaT is a **binary filter**, so it raises a `ValidationError` if you pass `target_compression_ratio`.

In [24]:
from compresr.exceptions import ValidationError

try:
    response = client.compress(
        context="Test content",
        query="Test query",
        compression_model_name=MODELS.QS_SAT,
        target_compression_ratio=0.5,  # ❌ This will fail!
    )
except ValidationError as e:
    print(f"✅ Expected error: {e}")
    print("\n💡 SaT is a binary filter - remove target_compression_ratio parameter")

✅ Expected error: Model 'qs_sat_v1' is a binary filter and does not support 'target_compression_ratio'. Remove this parameter for SaT models.

💡 SaT is a binary filter - remove target_compression_ratio parameter


## 4. SaT Batch Compression

Batch compression with SaT (no ratio parameter)

In [25]:
sat_batch_inputs = [
    {
        "context": "Machine learning uses algorithms to learn from data. Deep learning uses neural networks. NLP processes human language.",
        "query": "What is machine learning?"
    },
    {
        "context": "The solar system has 8 planets. Jupiter is the largest. Mars is called the Red Planet.",
        "query": "How many planets are there?"
    },
]

print("🤖 SaT Batch Compression (no ratio parameter)\n")

response = client.compress_batch(
    inputs=sat_batch_inputs,
    compression_model_name=MODELS.QS_SAT,
    # NOTE: No target_compression_ratio for SaT!
)

print(f"✅ Compressed {len(response.data.results)} contexts with SaT")
print(f"   Total tokens saved: {response.data.total_tokens_saved}\n")

for i, result in enumerate(response.data.results):
    print(f"📝 Context {i+1}: Query = '{sat_batch_inputs[i]['query']}'")
    print(f"   Kept: {result.compressed_context}")
    print(f"   Tokens: {result.original_tokens} → {result.compressed_tokens}")
    print(f"   Actual ratio: {result.actual_compression_ratio:.1%}\n")

🤖 SaT Batch Compression (no ratio parameter)

✅ Compressed 2 contexts with SaT
   Total tokens saved: 0

📝 Context 1: Query = 'What is machine learning?'
   Kept: Machine learning uses algorithms to learn from data. Deep learning uses neural networks. NLP processes human language.
   Tokens: 21 → 21
   Actual ratio: 0.0%

📝 Context 2: Query = 'How many planets are there?'
   Kept: The solar system has 8 planets. Jupiter is the largest. Mars is called the Red Planet.
   Tokens: 20 → 20
   Actual ratio: 0.0%



---
## 5. Async Operations

Async support works with both GemFilter and SaT models
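
Awaiting the calls one after another works, but independent requests can also be issued concurrently with `asyncio.gather`. The sketch below uses a stand-in coroutine (`fake_compress_async`, hypothetical) so it runs without the SDK; whether the real client handles concurrent calls safely is an assumption to verify.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


async def fake_compress_async(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for an async compress call; echoes the model name it was given."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"model": kwargs["compression_model_name"], "context": kwargs["context"]}


async def compress_with_models(
    compress: Callable[..., Awaitable[Dict[str, Any]]],
    context: str,
    query: str,
    per_model_kwargs: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Start one request per model and wait for all of them together."""
    tasks = [compress(context=context, query=query, **extra) for extra in per_model_kwargs]
    # gather returns results in the same order as the awaitables passed in
    return await asyncio.gather(*tasks)


def main() -> List[Dict[str, Any]]:
    return asyncio.run(
        compress_with_models(
            fake_compress_async,
            "Quantum computers use qubits.",
            "How do quantum computers work?",
            [
                {"compression_model_name": "qs_gemfilter_v1", "target_compression_ratio": 0.5},
                {"compression_model_name": "qs_sat_v1"},
            ],
        )
    )


if __name__ == "__main__":
    print(main())
```

Inside a notebook, where an event loop is already running, use `await compress_with_models(...)` directly instead of `asyncio.run`.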

In [26]:
async def compare_models_async():
    context = "Quantum computers use qubits that can exist in superposition. Classical computers use binary bits. Quantum entanglement enables faster computation."
    query = "How do quantum computers work?"
    target_ratio = 0.5
    
    # GemFilter (with ratio)
    print("🤖 Testing GemFilter (async)...")
    response_gem = await client.compress_async(
        context=context,
        query=query,
        compression_model_name=MODELS.QS_GEMFILTER,
        target_compression_ratio=target_ratio,
    )
    print(f"✅ GemFilter: {response_gem.data.tokens_saved} tokens saved")
    print(f"   Target ratio: {target_ratio:.1%}")
    print(f"   Actual ratio: {response_gem.data.actual_compression_ratio:.1%}")
    print(f"📝 Result: {response_gem.data.compressed_context}\n")
    
    # SaT (no ratio - binary filter)
    print("🤖 Testing SaT (async)...")
    print("   ⚠️  SaT is binary filter - no target ratio")
    response_sat = await client.compress_async(
        context=context,
        query=query,
        compression_model_name=MODELS.QS_SAT,
    )
    print(f"✅ SaT: {response_sat.data.tokens_saved} tokens saved")
    print(f"   Actual ratio: {response_sat.data.actual_compression_ratio:.1%}")
    print(f"📝 Result: {response_sat.data.compressed_context}")

await compare_models_async()

🤖 Testing GemFilter (async)...
✅ GemFilter: 7 tokens saved
   Target ratio: 50.0%
   Actual ratio: 25.9%
📝 Result: Quantum computers use qubits that can exist in superposition. Classical computers use binary bits. Quantum

🤖 Testing SaT (async)...
   ⚠️  SaT is binary filter - no target ratio
✅ SaT: 0 tokens saved
   Actual ratio: 0.0%
📝 Result: Quantum computers use qubits that can exist in superposition. Classical computers use binary bits. Quantum entanglement enables faster computation.


---
## 6. General Error Handling

Common error scenarios and how to handle them
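
A rate limit is usually worth retrying rather than just reporting. A minimal sketch of exponential backoff, assuming a rate-limit exception type such as the `RateLimitError` imported below; `call_with_retry` and `FakeRateLimitError` are hypothetical, and the delays are illustrative, not recommended values from the service.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


class FakeRateLimitError(Exception):
    """Stand-in for the SDK's rate-limit exception so this sketch runs on its own."""


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Type[BaseException] = FakeRateLimitError,
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call fn, retrying on retry_on with delays of base_delay * 2**attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the SDK this would be called as `call_with_retry(lambda: client.compress(...), retry_on=RateLimitError)`; other exception types propagate immediately.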

In [27]:
from compresr.exceptions import (
    CompresrError,
    AuthenticationError,
    RateLimitError,
    ValidationError
)

# Basic example
try:
    response = client.compress(
        context="Test context",
        query="What is this about?",
        compression_model_name=MODELS.QS_GEMFILTER,
    )
    print("✅ Success!")
except AuthenticationError:
    print("❌ Invalid API key")
except RateLimitError:
    print("⏳ Rate limit exceeded")
except ValidationError as e:
    print(f"❌ Invalid input: {e}")
except CompresrError as e:
    print(f"❌ API error: {e}")

✅ Success!


### 6.1 Common Error Scenarios

Test how the SDK handles various error conditions specific to QS compression:
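
The scenarios in this section all follow the same try/except shape, which can be factored into one helper. A sketch in plain Python; `run_scenario` is a hypothetical name, and the demo uses built-in exceptions so it runs without the SDK.

```python
from typing import Callable, Type


def run_scenario(name: str, call: Callable[[], object], expected: Type[BaseException]) -> str:
    """Run call() and describe whether it raised the expected exception type."""
    try:
        call()
    except expected as exc:
        return f"{name}: caught {type(exc).__name__}"
    except Exception as exc:
        return f"{name}: unexpected {type(exc).__name__}: {exc}"
    return f"{name}: no error raised"


scenarios = [
    ("bad int", lambda: int("x"), ValueError),
    ("missing key", lambda: {}["k"], ValueError),
    ("valid int", lambda: int("1"), ValueError),
]
for name, call, expected in scenarios:
    print(run_scenario(name, call, expected))
```

With the SDK, each entry would wrap a `client.compress(...)` call in a lambda and name `ValidationError` or `AuthenticationError` as the expected type.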

In [None]:
from compresr import QSCompressor
# Error 1: Unsupported model name
print("1️⃣  Testing unsupported model name...")
try:
    response = client.compress(
        context="Test context",
        query="Test query",
        compression_model_name="invalid_model_v1",  # ❌ Invalid model
    )
    print("   ❌ Unexpected success - should have raised an error\n")
except ValidationError as e:
    print(f"   ✅ Caught ValidationError: {e}\n")
except Exception as e:
    print(f"   ⚠️  Caught {type(e).__name__}: {e}\n")

# Error 2: Empty query parameter
print("2️⃣  Testing empty query parameter...")
try:
    response = client.compress(
        context="Test context",
        query="",  # ❌ Empty query
        compression_model_name=MODELS.QS_GEMFILTER,
    )
    print("   ❌ Unexpected success - should have raised an error\n")
except ValidationError as e:
    print(f"   ✅ Caught ValidationError: {e}\n")
except Exception as e:
    print(f"   ⚠️  Caught {type(e).__name__}: {e}\n")

# Error 3: SaT model with compression_ratio (unsupported combination)
print("3️⃣  Testing SaT with compression_ratio (unsupported)...")
try:
    response = client.compress(
        context="Test context",
        query="Test query",
        compression_model_name=MODELS.QS_SAT,
        target_compression_ratio=0.5,  # ❌ SaT doesn't support ratio!
    )
    print("   ❌ Unexpected success - should have raised an error\n")
except ValidationError as e:
    print(f"   ✅ Caught ValidationError: {e}\n")
except Exception as e:
    print(f"   ⚠️  Caught {type(e).__name__}: {e}\n")

# Error 4: Invalid compression ratio (out of range)
print("4️⃣  Testing invalid compression ratio...")
try:
    response = client.compress(
        context="Test context",
        query="Test query",
        compression_model_name=MODELS.QS_GEMFILTER,
        target_compression_ratio=-0.5,  # ❌ Out of range (validator accepts 0.1-0.9)
    )
    print("   ❌ Unexpected success - should have raised an error\n")
except ValidationError as e:
    print(f"   ✅ Caught ValidationError: {e}\n")
except Exception as e:
    print(f"   ⚠️  Caught {type(e).__name__}: {e}\n")

# Error 5: Invalid API key format (caught at client creation)
print("5️⃣  Testing invalid API key format...")
try:
    bad_client = QSCompressor(api_key="sk-invalid-key")  # ❌ Wrong prefix
    response = bad_client.compress(
        context="Test context",
        query="Test query",
        compression_model_name=MODELS.QS_GEMFILTER,
    )
    print("   ❌ Unexpected success - should have raised an error\n")
except AuthenticationError as e:
    print(f"   ✅ Caught AuthenticationError: {e}\n")
except Exception as e:
    print(f"   ⚠️  Caught {type(e).__name__}: {e}\n")

# Error 6: Using Agnostic model with QSCompressor (wrong client)
print("6️⃣  Testing wrong model for client type...")
try:
    response = client.compress(
        context="Test context",
        query="Test query",
        compression_model_name=MODELS.AGNOSTIC_LINGUA,  # ❌ Agnostic model with QS client
    )
    print("   ❌ Unexpected success - should have raised an error\n")
except ValidationError as e:
    print(f"   ✅ Caught ValidationError: {e}\n")
except Exception as e:
    print(f"   ⚠️  Caught {type(e).__name__}: {e}\n")

# Error 7: Empty context
print("7️⃣  Testing empty context...")
try:
    response = client.compress(
        context="",  # ❌ Empty context
        query="Test query",
        compression_model_name=MODELS.QS_GEMFILTER,
    )
    print("   ❌ Unexpected success - should have raised an error\n")
except ValidationError as e:
    print(f"   ✅ Caught ValidationError: {e}\n")
except Exception as e:
    print(f"   ⚠️  Caught {type(e).__name__}: {e}\n")

print("✅ All error scenarios tested!")

1️⃣  Testing unsupported model name...
   ✅ Caught ValidationError: Model 'invalid_model_v1' is not valid for QSCompressor. Allowed models: qs_gemfilter_v1, qs_sat_v1

2️⃣  Testing empty query parameter...
   ✅ Caught ValidationError: 1 validation error for CompressRequest
query
  String should have at least 1 character [type=string_too_short, input_value='', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/string_too_short

3️⃣  Testing SaT with compression_ratio (unsupported)...
   ✅ Caught ValidationError: Model 'qs_sat_v1' is a binary filter and does not support 'target_compression_ratio'. Remove this parameter for SaT models.

4️⃣  Testing invalid compression ratio...
   ✅ Caught ValidationError: 1 validation error for CompressRequest
target_compression_ratio
  Input should be greater than or equal to 0.1 [type=greater_than_equal, input_value=-0.5, input_type=float]
    For further information visit https://errors.pydantic.dev/2.12/v/greater_tha

## 7. Next Steps

- 📖 Docs: [docs.compresr.ai](https://docs.compresr.ai)
- 💬 Support: founders@compresr.ai
- 🚀 Try agnostic compression: `agnostic_compression_tutorial.ipynb`