Summary
Implement langextract tools for PraisonAI agents that create interactive HTML visualizations from text with highlighted extractions. This addresses the architectural violation identified in PR #1424, where these tools were incorrectly placed in the core SDK.
Background
These tools are currently being implemented in MervinPraison/PraisonAI#1424, but that placement violates the AGENTS.md architecture:
- Heavy optional dependency (langextract ~50MB+ with ML models)
- External integration tool (3rd party library)
- Not core SDK functionality
- Should be in praisonai-tools per AGENTS.md Section 2.2
Required Implementation
1. Langextract Extract Tool
from typing import Any, Dict, List, Optional

from praisonai_tools import tool

@tool
def langextract_extract(
    text: str,
    extractions: Optional[List[str]] = None,
    document_id: str = "agent-analysis",
    output_path: Optional[str] = None,
    auto_open: bool = False,
) -> Dict[str, Any]:
    """Extract and annotate text using langextract for interactive visualization.

    Creates an interactive HTML document with highlighted extractions that can be
    viewed in a browser. Useful for text analysis, entity extraction, and
    document annotation workflows.
    """
    # Implementation details...
2. Langextract Render File Tool
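# Assumed import location for the approval decorator; adjust it to wherever
# praisonai-tools actually exposes require_approval.
from praisonaiagents.approval import require_approval

@tool
@require_approval(risk_level="high")
def langextract_render_file(
    file_path: str,
    extractions: Optional[List[str]] = None,
    output_path: Optional[str] = None,
    auto_open: bool = False,
) -> Dict[str, Any]:
    """Read a text file and create langextract visualization."""
    # Implementation details...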
Technical Requirements
Dependencies
- langextract (optional dependency)
- Lazy imports with graceful degradation
- Clear error messages when not installed
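One way to get lazy imports with graceful degradation is to import the optional module inside the tool call and turn a failed import into a structured error. The following is a minimal sketch, not the implementation from PR #1424: the helper name `load_optional`, the function `langextract_extract_sketch`, and the shape of the returned dictionaries are assumptions made for illustration.

```python
import importlib
from typing import Any, Dict, Optional, Tuple


def load_optional(
    module_name: str, install_hint: str
) -> Tuple[Optional[Any], Optional[Dict[str, Any]]]:
    """Import a module on first use.

    Returns (module, None) on success or (None, error_dict) if it is missing.
    The error dict layout is hypothetical, chosen only for this sketch.
    """
    try:
        return importlib.import_module(module_name), None
    except ImportError:
        return None, {
            "success": False,
            "error": f"{module_name} is not installed. Install it with: {install_hint}",
        }


def langextract_extract_sketch(text: str) -> Dict[str, Any]:
    # The import happens inside the call, so importing this file never
    # requires langextract to be present.
    lx, error = load_optional("langextract", "pip install langextract")
    if error is not None:
        return error
    # Real extraction and rendering with lx would go here.
    return {"success": True, "chars": len(text)}
```

Returning an error dictionary instead of raising keeps the agent loop running and gives the model a message it can relay to the user; raising `ImportError` with the same hint would be an equally reasonable choice.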
API Compliance
- Use the correct langextract API:
  - lx.data.AnnotatedDocument
  - lx.data.CharInterval(start_pos=X, end_pos=Y)
  - lx.io.save_annotated_documents() + lx.visualize()
- Follow praisonai-tools patterns (BaseTool, @tool decorator)
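Because `lx.data.CharInterval` takes `start_pos` and `end_pos`, the tool needs character offsets for each phrase passed in `extractions`. The sketch below shows one way to compute them in plain Python before any langextract objects are built. The function name `find_intervals` and the dictionary layout are hypothetical, and the sketch assumes `end_pos` is exclusive (Python slice convention), which should be checked against the langextract documentation.

```python
from typing import Dict, List


def find_intervals(text: str, phrases: List[str]) -> List[Dict[str, object]]:
    """Locate every non-overlapping occurrence of each phrase in text."""
    found = []
    for phrase in phrases:
        if not phrase:
            continue  # an empty phrase would match at every position
        start = text.find(phrase)
        while start != -1:
            end = start + len(phrase)  # exclusive end offset (assumption)
            found.append({"text": phrase, "start_pos": start, "end_pos": end})
            start = text.find(phrase, end)
    # Sort by position so highlights appear in reading order.
    return sorted(found, key=lambda item: item["start_pos"])
```

Each resulting entry carries the values that would be passed as `start_pos` and `end_pos`; with the exclusive-end assumption, `text[start_pos:end_pos]` reproduces the phrase exactly.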
Security
- File operations require approval (@require_approval)
- Input validation for text and file paths
- Cross-platform file URI handling (Path.as_uri())
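The path validation and URI handling above could be combined in one small helper. This is a sketch under assumptions: the name `validated_file_uri` and the specific checks are illustrative, and a real tool may also want to restrict paths to an allowed directory.

```python
from pathlib import Path


def validated_file_uri(file_path: str) -> str:
    """Resolve file_path, check that it is a regular file, and return a file:// URI."""
    if not file_path or not file_path.strip():
        raise ValueError("file_path must be a non-empty string")
    path = Path(file_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Not a regular file: {path}")
    # as_uri() requires an absolute path, which resolve() provides, and it
    # takes care of Windows drive letters and percent-encoding.
    return path.as_uri()
```

The returned URI can be handed to `webbrowser.open()` when `auto_open` is true, which avoids building `file://` strings by hand on Windows.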
Integration with PraisonAI
Agent Usage
from praisonaiagents import Agent
from praisonai_tools import langextract_extract, langextract_render_file

agent = Agent(
    name="text_analyzer",
    instructions="Analyze text and create interactive visualizations",
    tools=[langextract_extract, langextract_render_file],
)
response = agent.start("Analyze this contract text and highlight key terms")
Installation
pip install "praisonai-tools[langextract]"
# or
pip install praisonai-tools langextract
Files to Create
- praisonai_tools/tools/langextract_tool.py - Main implementation
- tests/test_langextract_tool.py - Unit tests with langextract installed
- Update praisonai_tools/tools/__init__.py - Export tools
- Update pyproject.toml - Add langextract optional dependency
- examples/langextract_example.py - Usage example
Success Criteria
- ✅ Tools work with PraisonAI agents
- ✅ Graceful degradation without langextract installed
- ✅ Interactive HTML generation with extraction highlighting
- ✅ File I/O with security approval
- ✅ Cross-platform compatibility
- ✅ Unit tests plus real agentic tests
- ✅ Documentation and examples
Reference Implementation
The current implementation in MervinPraison/PraisonAI#1424 can be used as a starting point, but needs:
- API fixes (correct langextract usage)
- Proper approval decorator usage
- Architecture compliance (move to praisonai-tools)
Fixes #1421 (PraisonAI follow-up 3 - langextract tools)
Priority: High - Blocks MervinPraison/PraisonAI#1424 architectural compliance
Assignee: Please assign to someone familiar with praisonai-tools patterns