Release 1.0.16 - Multi-Provider AI Support
🎉 Major Features
Multi-Provider AI Support
undatum now supports multiple AI providers for automatic field and dataset documentation:
- OpenAI - GPT-4o-mini, GPT-4o, GPT-3.5-turbo, and more
- OpenRouter - Unified API for accessing models from OpenAI, Anthropic, Google, and others
- Ollama - Run local models without API keys
- LM Studio - Local models via OpenAI-compatible API
- Perplexity - Backward compatible with existing Perplexity integration
Structured AI Output
- Replaced fragile text parsing with JSON Schema-based structured output
- More reliable AI response parsing
- Better error handling and fallback mechanisms
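As a rough illustration of why structured output is easier to consume than free text, the sketch below parses a JSON response and falls back to an empty result when the response is unusable. The function name `parse_field_descriptions` and the response shape (`{"fields": [{"name": ..., "description": ...}]}`) are assumptions made for this example, not undatum's actual schema.

```python
import json


def parse_field_descriptions(raw_response: str) -> dict:
    """Turn a JSON response into a {field_name: description} mapping.

    Assumed (hypothetical) shape: {"fields": [{"name": ..., "description": ...}]}.
    Returns an empty dict as a fallback when the response cannot be used.
    """
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        return {}  # fallback: the caller can continue without AI descriptions

    if not isinstance(payload, dict):
        return {}

    fields = payload.get("fields")
    if not isinstance(fields, list):
        return {}

    result = {}
    for item in fields:
        # Skip malformed entries instead of failing the whole response.
        if isinstance(item, dict) and "name" in item:
            result[item["name"]] = item.get("description", "")
    return result
```

Because the response is parsed with `json.loads`, a malformed answer surfaces as a single, well-defined exception rather than as silently misaligned columns.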
Flexible Configuration
Configure AI providers through:
- Environment variables (lowest precedence)
- Config files (`undatum.yaml` or `~/.undatum/config.yaml`)
- CLI arguments (highest precedence)
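The precedence order above can be sketched as a simple layered merge. This is a minimal illustration, assuming each source has already been loaded into a mapping; the function name `resolve_ai_settings` and the keys `provider` and `model` are hypothetical and are not taken from undatum's code.

```python
from typing import Mapping, Optional


def resolve_ai_settings(
    env: Mapping[str, Optional[str]],
    config_file: Mapping[str, Optional[str]],
    cli_args: Mapping[str, Optional[str]],
) -> dict:
    """Merge settings so that later sources override earlier ones.

    Order: environment < config file < CLI arguments.
    A value of None means "not set" and never overrides an earlier value.
    """
    merged = {}
    for source in (env, config_file, cli_args):
        for key, value in source.items():
            if value is not None:
                merged[key] = value
    return merged


# Hypothetical values: the config file overrides the environment's provider,
# and the CLI overrides the model but leaves the provider untouched.
settings = resolve_ai_settings(
    env={"provider": "openai", "model": "gpt-4o-mini"},
    config_file={"provider": "ollama"},
    cli_args={"provider": None, "model": "llama3"},
)
print(settings)  # {'provider': 'ollama', 'model': 'llama3'}
```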
✨ What's New
Added
- Multi-provider AI support: Added support for OpenAI, OpenRouter, Ollama, LM Studio, and Perplexity APIs
- Structured AI output: Replaced fragile text parsing with JSON Schema-based structured output for reliable AI responses
- Flexible AI configuration: Support for environment variables, config files (`undatum.yaml` or `~/.undatum/config.yaml`), and CLI arguments with proper precedence
- AI provider factory: New `get_ai_service()` function for easy provider instantiation
- Enhanced error handling: Proper exception classes (`AIServiceError`, `AIConfigurationError`, `AIAPIError`) with clear error messages
- CLI arguments for AI: Added `--ai-provider`, `--ai-model`, and `--ai-base-url` options to the `analyze` command
- Configuration management: New `undatum/ai/config.py` module for unified configuration handling
- Backward compatibility: Old `get_fields_info()` and `get_description()` functions maintained for compatibility
- Code quality improvements and a higher Pylint score
- Better error handling and resource management
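One way the three exception classes listed above could fit together is sketched here. The class names come from this release; the inheritance relationship (both specific errors deriving from `AIServiceError`) and the helper `describe_failure` are assumptions for illustration, and the classes are redefined locally so the example runs without undatum installed.

```python
class AIServiceError(Exception):
    """Base class for AI-related failures (assumed hierarchy)."""


class AIConfigurationError(AIServiceError):
    """Missing or invalid settings, for example an absent API key."""


class AIAPIError(AIServiceError):
    """A request to the provider failed."""


def describe_failure(exc: Exception) -> str:
    """Map an exception to a short user-facing message (hypothetical helper)."""
    # Check the most specific classes first, then the shared base class.
    if isinstance(exc, AIConfigurationError):
        return f"Check your AI configuration: {exc}"
    if isinstance(exc, AIAPIError):
        return f"The AI provider request failed: {exc}"
    if isinstance(exc, AIServiceError):
        return f"AI service error: {exc}"
    return f"Unexpected error: {exc}"
```

Under this assumed layout, callers that only need a fallback can catch `AIServiceError` once, while callers that want to distinguish configuration problems from API failures can catch the subclasses.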
Changed
- AI system refactoring: Completely refactored AI documentation system from Perplexity-only to multi-provider architecture
- Structured responses: All AI providers now use JSON Schema (`response_format: json_object`) instead of parsing CSV from markdown code blocks
- Provider architecture: Implemented abstract base class `AIService` with concrete provider implementations
- Improved code quality: fixed indentation, trailing whitespace, and formatting issues
- Refactored file operations to use `with` statements for better resource management
- Updated string formatting to use f-strings and lazy logging
- Fixed dangerous default arguments in function signatures
- Improved type hints and code documentation
- Updated `analyze` command to accept AI provider configuration
- Updated `schemer` command to use new AI service interface
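The provider architecture described above, an abstract base class with concrete implementations behind a factory, might look roughly like the following. Only the names `AIService` and `get_ai_service` appear in this release; the method `describe_fields`, the constructor parameters, the factory signature, the `EchoService` stand-in, and the registry are all assumptions made to keep the sketch self-contained and free of network calls.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Type


class AIService(ABC):
    """Interface shared by all providers (illustrative, not undatum's code)."""

    def __init__(self, model: str, base_url: Optional[str] = None) -> None:
        self.model = model
        self.base_url = base_url

    @abstractmethod
    def describe_fields(self, fields: List[str]) -> Dict[str, str]:
        """Return a description for each field name (hypothetical method)."""


class EchoService(AIService):
    """Stand-in provider that fabricates descriptions locally."""

    def describe_fields(self, fields: List[str]) -> Dict[str, str]:
        return {name: f"Field '{name}' (described by {self.model})" for name in fields}


# Hypothetical registry mapping provider names to implementations.
_PROVIDERS: Dict[str, Type[AIService]] = {"echo": EchoService}


def get_ai_service(
    provider: str, model: str, base_url: Optional[str] = None
) -> AIService:
    """Instantiate the provider registered under `provider` (assumed signature)."""
    try:
        service_cls = _PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"Unknown AI provider: {provider!r}") from None
    return service_cls(model=model, base_url=base_url)
```

With this shape, calling code depends only on the abstract interface, so adding a provider means registering one more subclass rather than touching the commands that use it.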
Fixed
- Fixed critical bug: added missing `_process_json_data` function in analyzer module
- Fixed bad indentation issues in `duckdb_decompose` function
- Fixed redefined builtin `id` parameter (renamed to `table_id`)
- Fixed unused imports and arguments
- Fixed dictionary iteration patterns (removed unnecessary `.keys()` calls)
- Fixed `isinstance()` calls to use tuple syntax for better performance
- Improved file handling with proper context managers
- Fixed fragile AI response parsing: Replaced error-prone text extraction with proper JSON parsing
- Fixed AI service initialization: Added proper error handling and fallback when AI service fails to initialize
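Several of the fixes above correspond to common Python idioms. The sketch below shows generic versions of three of them (direct dictionary iteration, `isinstance()` with a tuple of types, and file handling through a context manager). The function names are hypothetical and the code is not taken from undatum.

```python
import json


def count_scalar_fields(record: dict) -> int:
    """Count values in a record that are plain strings or numbers."""
    count = 0
    for key in record:  # iterate the dict directly; .keys() is unnecessary
        # A single isinstance() call with a tuple replaces chained checks.
        if isinstance(record[key], (str, int, float)):
            count += 1
    return count


def load_records(path: str) -> list:
    """Read a JSON Lines file, skipping blank lines."""
    # The with statement closes the file even if parsing raises.
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```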
📦 Installation
pip install --upgrade undatum