# EncypherAI Basic Usage

This notebook demonstrates the basic usage of EncypherAI for encoding and decoding metadata in text.

EncypherAI embeds invisible metadata in text using Unicode variation selectors, which makes it well suited to tracking the provenance of LLM-generated content.
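To give a feel for how variation selectors can carry data, here is a minimal sketch using only the standard library. It is not EncypherAI's actual encoding: the byte-to-selector mapping and the helper names (`byte_to_selector`, `hide_bytes`, `reveal_bytes`) are assumptions made for illustration. It relies only on the fact that Unicode defines 256 variation selectors (U+FE00 to U+FE0F and U+E0100 to U+E01EF), enough to represent one byte each.

```python
# Illustrative only: a byte <-> variation-selector mapping.
# This is an assumed scheme, not EncypherAI's internal format.

def byte_to_selector(b: int) -> str:
    """Map a byte (0-255) to one of the 256 Unicode variation selectors."""
    if not 0 <= b <= 255:
        raise ValueError("expected a byte value in range 0-255")
    if b < 16:
        return chr(0xFE00 + b)        # VS1-VS16
    return chr(0xE0100 + (b - 16))    # VS17-VS256


def selector_to_byte(ch: str):
    """Inverse mapping; returns None for ordinary characters."""
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F:
        return cp - 0xFE00
    if 0xE0100 <= cp <= 0xE01EF:
        return cp - 0xE0100 + 16
    return None


def hide_bytes(carrier: str, payload: bytes) -> str:
    """Insert the payload, as variation selectors, after the first character."""
    if not carrier:
        raise ValueError("carrier must not be empty")
    hidden = "".join(byte_to_selector(b) for b in payload)
    return carrier[0] + hidden + carrier[1:]


def reveal_bytes(text: str) -> bytes:
    """Collect every variation selector in the text and decode it to bytes."""
    decoded = (selector_to_byte(ch) for ch in text)
    return bytes(b for b in decoded if b is not None)


demo_payload = b'{"model_id": "gpt-4"}'
demo_text = hide_bytes("Hello, world", demo_payload)
print(len("Hello, world"), len(demo_text))      # the lengths differ
print(reveal_bytes(demo_text).decode("utf-8"))  # the payload round-trips
```

Most renderers draw `demo_text` exactly like the original string, because a variation selector has no glyph of its own.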

## Installation

First, install the EncypherAI package:

In [None]:
# Uncomment to install
# !pip install encypher-ai

## Basic Encoding and Decoding

Let's start with the basic encoding and decoding functionality using the `UnicodeMetadata` class:

In [None]:
from encypher.core.unicode_metadata import UnicodeMetadata, MetadataTarget
from datetime import datetime, timezone

# Sample text
text = "This is a sample text that will have metadata embedded in it."

# Metadata to embed
model_id = "gpt-4"
timestamp = datetime.now(timezone.utc).isoformat()
custom_metadata = {
    "request_id": "req_12345",
    "user_id": "user_6789",
    "cost": 0.0023
}

# Embed metadata
encoded_text = UnicodeMetadata.embed_metadata(
    text=text,
    model_id=model_id,
    timestamp=timestamp,
    target=MetadataTarget.WHITESPACE,  # Can also use "whitespace" as string
    custom_metadata=custom_metadata
)

print("Original text:")
print(text)
print("\nEncoded text (looks the same but contains metadata):")
print(encoded_text)

Now, let's extract the metadata from the encoded text:

In [None]:
# Extract metadata
extracted_metadata = UnicodeMetadata.extract_metadata(encoded_text)

print("Extracted metadata:")
for key, value in extracted_metadata.items():
    print(f"{key}: {value}")
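Because the encoded text prints identically to the original, it can help to inspect the hidden code points directly. The sketch below uses only the standard library and assumes the hidden characters are variation selectors, as described in the introduction; the helper names are hypothetical and not part of EncypherAI.

```python
# Illustrative helpers (hypothetical names, not part of EncypherAI).

def is_variation_selector(ch: str) -> bool:
    """True for U+FE00..U+FE0F and U+E0100..U+E01EF."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def hidden_code_points(text: str):
    """Return (index, 'U+XXXX') for every variation selector in the text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_variation_selector(ch)
    ]


def strip_variation_selectors(text: str) -> str:
    """Return the text with all variation selectors removed."""
    return "".join(ch for ch in text if not is_variation_selector(ch))


# A hand-built sample containing two invisible selectors.
sample = "A\ufe01 plain-looking\U000e0101 sentence."
print(hidden_code_points(sample))
print(strip_variation_selectors(sample))
```

Passing `encoded_text` from the cell above to `hidden_code_points` should list the embedded characters, provided the library stores its payload in these code point ranges.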

## Using the MetadataEncoder

For more advanced use cases, you can use the `MetadataEncoder` class, which adds HMAC verification so that tampering with the embedded metadata can be detected:

In [None]:
from datetime import datetime, timezone
import json

from encypher.core.metadata_encoder import MetadataEncoder

# Initialize encoder with a secret key (placeholder; load a real key from secure configuration)
encoder = MetadataEncoder(secret_key="your-secret-key")

# Sample text
text = "This text will have metadata embedded with HMAC verification."

# Metadata to embed
metadata = {
    "model_id": "gpt-4",
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "request_id": "req_12345",
    "user_id": "user_6789",
    "cost": 0.0023
}

# Encode metadata
encoded_text = encoder.encode_metadata(text, metadata)

print("Original text:")
print(text)
print("\nEncoded text (looks the same but contains metadata with HMAC):")
print(encoded_text)

Now, let's verify and decode the metadata:

In [None]:
# Verify and decode metadata
is_valid, extracted_metadata, clean_text = encoder.verify_text(encoded_text)

print(f"Is valid: {is_valid}")
print("\nExtracted metadata:")
print(json.dumps(extracted_metadata, indent=2))
print("\nClean text:")
print(clean_text)
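The idea behind HMAC verification can be sketched with Python's standard `hmac` module. This is a general illustration of the technique under stated assumptions (SHA-256, a canonical JSON serialization, hypothetical helper names), not a description of how `MetadataEncoder` is implemented internally.

```python
import hashlib
import hmac
import json


def sign_metadata(metadata: dict, secret_key: str) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON form of the metadata."""
    # sort_keys and fixed separators make the serialization deterministic.
    payload = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret_key.encode("utf-8"), payload, hashlib.sha256).hexdigest()


def verify_metadata(metadata: dict, signature: str, secret_key: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_metadata(metadata, secret_key)
    return hmac.compare_digest(expected, signature)


demo_metadata = {"model_id": "gpt-4", "request_id": "req_12345"}
demo_signature = sign_metadata(demo_metadata, "your-secret-key")

# Unmodified metadata verifies with the right key.
print(verify_metadata(demo_metadata, demo_signature, "your-secret-key"))

# Any change to the metadata, or a different key, makes verification fail.
tampered = dict(demo_metadata, model_id="other-model")
print(verify_metadata(tampered, demo_signature, "your-secret-key"))
print(verify_metadata(demo_metadata, demo_signature, "wrong-key"))
```

Only someone holding the secret key can produce a tag that verifies, which is why the same key must be supplied when encoding and when verifying.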

## Streaming Support

EncypherAI also supports streaming responses from LLMs. The `StreamingHandler` class embeds metadata as chunks arrive:

In [None]:
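# Re-imported here so this cell also runs on its own
from datetime import datetime, timezone
import json

from encypher.core.unicode_metadata import MetadataTarget, UnicodeMetadata
from encypher.streaming.handlers import StreamingHandler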

# Initialize streaming handler
handler = StreamingHandler(
    metadata={
        "model_id": "gpt-4",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_id": "req_12345"
    },
    target=MetadataTarget.WHITESPACE,
    encode_first_chunk_only=True  # Only encode the first chunk
)

# Simulate streaming chunks
chunks = [
    "This is the first",
    " chunk of a streaming",
    " response from an LLM.",
    " Metadata will be embedded",
    " in the appropriate places."
]

# Process each chunk
processed_chunks = []
for chunk in chunks:
    processed_chunk = handler.process_chunk(chunk)
    processed_chunks.append(processed_chunk)
    print(f"Processed chunk: {processed_chunk}")

# Combine all chunks
full_text = "".join(processed_chunks)
print("\nFull text:")
print(full_text)

# Extract metadata from full text
extracted_metadata = UnicodeMetadata.extract_metadata(full_text)
print("\nExtracted metadata:")
print(json.dumps(extracted_metadata, indent=2))
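To illustrate what embedding into only the first suitable chunk can look like, here is a toy handler written from scratch. It is a sketch under assumptions, not the `StreamingHandler` implementation: the class name `FirstWhitespaceEmbedder` is hypothetical, and the choice to insert the payload immediately after the first whitespace character is an assumption made for the example.

```python
class FirstWhitespaceEmbedder:
    """Toy stream processor: insert a hidden payload once, after the first whitespace."""

    def __init__(self, hidden_payload: str):
        self.hidden_payload = hidden_payload
        self.embedded = False

    def process_chunk(self, chunk: str) -> str:
        # Once the payload is in the stream, pass later chunks through unchanged.
        if self.embedded:
            return chunk
        for i, ch in enumerate(chunk):
            if ch.isspace():
                self.embedded = True
                return chunk[: i + 1] + self.hidden_payload + chunk[i + 1 :]
        # No whitespace yet: emit the chunk as-is and try again on the next one.
        return chunk


# Two variation selectors stand in for an encoded metadata payload.
toy_payload = "\ufe00\ufe01"
toy_handler = FirstWhitespaceEmbedder(toy_payload)

toy_chunks = ["Hello", " streaming", " world"]
toy_output = "".join(toy_handler.process_chunk(c) for c in toy_chunks)

print(toy_output.count(toy_payload))        # how many times the payload was inserted
print(toy_output.replace(toy_payload, ""))  # the visible text
```

The first chunk contains no whitespace, so it passes through untouched and the payload lands in the second chunk. A real handler also has to decide what to do if the stream ends before a suitable target appears.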

## Integration with LiteLLM

EncypherAI integrates with LiteLLM, which lets the same code work with various LLM providers:

In [None]:
# Remove the surrounding triple quotes to run this cell (requires a configured API key)
"""
import os
import litellm

# Set your API key (prefer exporting OPENAI_API_KEY in your environment over hardcoding it)
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"  # Replace with your actual key

# Initialize streaming handler
handler = StreamingHandler(
    metadata={
        "model_id": "gpt-3.5-turbo",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_id": "req_12345"
    },
    target=MetadataTarget.WHITESPACE
)

# Define messages
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short paragraph about AI ethics."}
]

# Generate streaming response
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=messages,
    stream=True
)

# Process streaming chunks
full_response = ""
for chunk in response:
    # Some chunks carry no delta or no content; skip those
    delta = getattr(chunk.choices[0], "delta", None)
    content = getattr(delta, "content", None)
    if content:
        processed_chunk = handler.process_chunk(content)
        full_response += processed_chunk
        print(processed_chunk, end="")

print("\n\nExtracted metadata:")
print(json.dumps(UnicodeMetadata.extract_metadata(full_response), indent=2))
"""

## Conclusion

This notebook covered the basics of EncypherAI: embedding and extracting metadata with `UnicodeMetadata`, HMAC verification with `MetadataEncoder`, and streaming with `StreamingHandler`. Together, these let you embed invisible metadata in LLM-generated content for provenance tracking, attribution, and verification.

For more advanced usage and examples, please refer to the [EncypherAI documentation](https://docs.encypherai.com).