Release Notes v0.2.2

@zhixiangxue zhixiangxue released this 02 Dec 13:19
· 78 commits to main since this release

Release Date: December 2, 2024

What's New

Multimodal Conversation Support

Conversation now supports multimodal inputs through the attachments parameter. Send images, audio, video, and documents alongside your text messages.

Key Features:

  • Flexible Input Formats: All attachment types accept local file paths, remote URLs, or base64-encoded data URIs
  • Rich Media Support:
    • Images (JPEG, PNG, GIF, WEBP)
    • Audio (WAV, MP3, OGG)
    • Video (MP4, WEBM)
  • Document Processing:
    • PDF documents
    • Word files (DOC, DOCX)
    • Excel spreadsheets (XLS, XLSX)
    • CSV data files
    • Plain text and Markdown files
  • Web Content: Analyze web pages via URL links
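Of the three input formats listed above, the base64 data URI is the one that usually needs a little preparation. The sketch below shows one way to build such a URI from a local file using only the Python standard library. The helper name `to_data_uri` is hypothetical and not part of chak; it simply produces a string in the standard `data:<mime>;base64,<payload>` form.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Optional


def to_data_uri(path: str, mime_type: Optional[str] = None) -> str:
    """Encode a local file as a base64 data URI (hypothetical helper, not part of chak)."""
    # Use the explicit MIME type if given, otherwise guess it from the file
    # extension, and fall back to a generic binary type if the guess fails.
    mime = mime_type or mimetypes.guess_type(path)[0] or "application/octet-stream"
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Since attachment types are described as accepting data URIs, the returned string could presumably be passed where a path or URL would go, for example `Image(to_data_uri("photo.jpg"))`.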

Usage Example:

import asyncio

from chak import Conversation, Image, PDF


async def main():
    conv = Conversation("openai/gpt-4o", api_key="YOUR_KEY")

    # Analyze an image
    response = await conv.asend(
        "What's in this image?",
        attachments=[Image("photo.jpg")]  # Supports local path, URL, or base64 data URI
    )

    # Process a document; allow a longer timeout for a large file
    response = await conv.asend(
        "Summarize this report",
        attachments=[PDF("report.pdf")],
        timeout=120
    )

    # Multiple attachments
    response = await conv.asend(
        "Compare these images",
        attachments=[
            Image("https://example.com/img1.jpg"),
            Image("./local/img2.png")
        ]
    )


# asend() is a coroutine, so it must be awaited inside an event loop
asyncio.run(main())

Documentation:

  • Updated README with comprehensive multimodal support section
  • Added usage examples for all supported file types

Technical Details

  • Attachment Classes: New attachment types are exported from the chak package: Image, Audio, Video, PDF, DOC, Excel, CSV, TXT, Link
  • MimeType Support: Explicit MIME type specification available via MimeType enum
  • Custom Readers: Built-in readers for all document types, with support for custom reader functions
  • Streaming Compatible: Multimodal inputs work seamlessly with streaming responses
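To make the relationship between file formats and attachment classes concrete, here is a minimal sketch that maps a path or URL to the name of the class that would likely handle it. The mapping follows the formats listed in these notes; the helper `attachment_kind` is hypothetical, routing Markdown to TXT is an assumption, and treating any other http(s) URL as a Link is a guess at the intended behavior rather than documented chak logic. It returns class names as strings so it runs without chak installed, and it does not handle data URIs.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension -> attachment class name, following the formats listed in these notes.
# Routing ".md" to TXT is an assumption.
_KIND_BY_EXT = {
    ".jpg": "Image", ".jpeg": "Image", ".png": "Image", ".gif": "Image", ".webp": "Image",
    ".wav": "Audio", ".mp3": "Audio", ".ogg": "Audio",
    ".mp4": "Video", ".webm": "Video",
    ".pdf": "PDF",
    ".doc": "DOC", ".docx": "DOC",
    ".xls": "Excel", ".xlsx": "Excel",
    ".csv": "CSV",
    ".txt": "TXT", ".md": "TXT",
}


def attachment_kind(source: str) -> str:
    """Return the name of the attachment class that likely fits `source` (hypothetical helper)."""
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    # For URLs, inspect only the path component so query strings are ignored.
    path = parsed.path if is_url else source
    ext = PurePosixPath(path.replace("\\", "/")).suffix.lower()
    if ext in _KIND_BY_EXT:
        return _KIND_BY_EXT[ext]
    if is_url:
        # A URL without a known media extension is treated as a web page.
        return "Link"
    raise ValueError(f"unsupported attachment type: {source!r}")
```

A caller could use the returned name to choose which class to import from chak, keeping the extension table in one place.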

Notes

  • Not all LLM providers support every modality. Check your provider's documentation for its capabilities.
  • Large files may need longer timeouts. Pass the timeout parameter when needed.
  • Both send() and asend() accept attachments, but asend() is recommended for better performance with large files.

Installation

Update to the latest version:

pip install --upgrade chakpy

# With all optional dependencies
pip install --upgrade "chakpy[all]"

For full details, see the updated README.