Enhancement Roadmap: Nutrient DWS Python Client
Overview
This issue tracks the enhancement plan for the Nutrient DWS Python Client, based on an analysis of OpenAPI specification v1.9.0. The goal is to expand API coverage from ~30% to ~80% while maintaining our standards for code quality and backward compatibility.
Enhancement Categories
🔵 Priority 1: Enhanced Existing Methods
Improve current methods with additional OpenAPI capabilities
- #1 Multi-Language OCR Support - Support multiple languages in ocr_pdf()
- #2 Image Watermark Support - Add image watermarks to watermark_pdf()
- #3 Selective Annotation Flattening - Add annotation ID filtering to flatten_annotations()
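One way the Priority 1 changes could stay backward compatible is to widen what a parameter accepts rather than change its meaning. The sketch below assumes ocr_pdf() currently takes a single language string. The helper name normalize_languages, the "english" default, and the lowercasing step are all hypothetical and are not taken from the client or the OpenAPI spec.

```python
from typing import List, Sequence, Union


def normalize_languages(language: Union[str, Sequence[str]] = "english") -> List[str]:
    """Return a de-duplicated list of language names from a str or a sequence of str.

    Hypothetical helper: the default value and the lowercase normalization are
    assumptions made for illustration only.
    """
    # A bare string is treated as a single language so existing callers keep working.
    candidates = [language] if isinstance(language, str) else list(language)
    result: List[str] = []
    for item in candidates:
        if not isinstance(item, str) or not item.strip():
            raise ValueError(f"Invalid OCR language: {item!r}")
        name = item.strip().lower()
        if name not in result:  # preserve first-seen order, drop repeats
            result.append(name)
    if not result:
        raise ValueError("At least one OCR language is required")
    return result
```

With a helper like this, a call that passes a single string and a call that passes a list would both reach the request-building code as a list, so the existing signature would not need to change.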
🟢 Priority 2: Core Missing Methods
Add commonly requested document operations
- #4 Create Redactions - Implement create_redactions() with text/regex/preset strategies
- #5 Import Annotations - Implement import_annotations() for Instant JSON/XFDF
- #6 Extract Page Range - Simple extract_pages() method (simpler than split_pdf)
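As a rough illustration of what "simpler than split_pdf" could mean for extract_pages(), the sketch below validates a single page range before any request is made. The function name resolve_page_range and the indexing convention (zero-based start, exclusive end, None meaning "through the last page") are assumptions chosen for this example. The real method should follow whatever the OpenAPI spec defines.

```python
from typing import Optional, Tuple


def resolve_page_range(start: int, end: Optional[int], page_count: int) -> Tuple[int, int]:
    """Return a validated (first, last) pair of zero-based, inclusive page indexes.

    Hypothetical helper: `start` is zero-based and inclusive, `end` is exclusive
    (like a Python slice), and `end=None` selects through the last page.
    """
    if page_count < 1:
        raise ValueError("Document has no pages")
    stop = page_count if end is None else end
    if not 0 <= start < page_count:
        raise ValueError(f"start must be in [0, {page_count - 1}], got {start}")
    if not start < stop <= page_count:
        raise ValueError(f"end must be in [{start + 1}, {page_count}], got {stop}")
    return start, stop - 1
```

Failing fast on an invalid range keeps the error close to the caller instead of surfacing it as an API error after the upload.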
🟡 Priority 3: Format Conversion Methods
Enable output format flexibility
- #7 Convert to PDF/A - Implement convert_to_pdfa() for archival compliance
- #8 Convert to Images - Implement convert_to_images() for PNG/JPEG/WebP
- #9 Extract Content as JSON - Implement extract_content() for structured data
- #10 Convert to Office Formats - Implement convert_to_office() for DOCX/XLSX/PPTX
🟠 Priority 4: Advanced Features
Sophisticated document processing capabilities
- #11 AI-Powered Redaction - Implement ai_redact() using AI entity detection
- #12 Digital Signatures - Implement sign_pdf() with visual signatures
- #13 Batch Processing - Client-side batch_process() for bulk operations
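Since batch_process() is described as client-side, one possible shape is a thin wrapper that fans a per-file operation out over a thread pool and collects failures per file. This is a minimal sketch under that assumption. The signature, the (results, errors) return value, and the max_workers default are hypothetical and not part of the current client.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple, TypeVar

T = TypeVar("T")


def batch_process(
    paths: Iterable[str],
    operation: Callable[[str], T],
    max_workers: int = 4,
) -> Tuple[Dict[str, T], Dict[str, Exception]]:
    """Apply `operation` to each path concurrently; return (results, errors) keyed by path.

    Hypothetical sketch: `operation` stands in for any client call that takes a file path.
    """
    results: Dict[str, T] = {}
    errors: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its input path so outcomes can be reported per file.
        futures = {pool.submit(operation, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # Record the failure and keep going rather than aborting the whole batch.
                errors[path] = exc
    return results, errors
```

Threads are a reasonable fit here because each operation would spend most of its time waiting on HTTP. Any rate limits on the API side would still need to be respected when choosing max_workers.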
Implementation Timeline
Phase 1 (Weeks 1-4)
Focus on Priority 1 enhancements that improve existing methods:
- Multi-language OCR
- Image watermarks
- Selective flattening
Phase 2 (Weeks 5-8)
Add Priority 2 core methods:
- Create redactions
- Import annotations
- PDF/A conversion
Phase 3 (Weeks 9-12)
Implement Priority 3 format conversions:
- Image conversion
- Content extraction
- Office format export
Phase 4 (Weeks 13-16)
Deliver Priority 4 advanced features:
- AI redaction
- Digital signatures
- Batch processing
Success Metrics
- API Coverage: Increase from ~30% to ~80%
- Test Coverage: Maintain 95%+ coverage
- Documentation: 100% method documentation with examples
- Performance: Sub-second operations for common tasks
- Backward Compatibility: Zero breaking changes
Implementation Guidelines
For each enhancement:
- Review OpenAPI specification for exact requirements
- Implement with backward compatibility in mind
- Add comprehensive unit and integration tests
- Include detailed docstrings with examples
- Update documentation and changelog
- Consider performance implications
Related Documents
- FUTURE_ENHANCEMENTS_PLAN.md - Detailed enhancement specifications
- OPENAPI_COMPLIANCE_REVIEW.md - Current compliance status
- openapi_spec.yml - Official API specification v1.9.0
Contributing
We welcome contributions! Please:
- Comment on the specific issue you'd like to work on
- Follow the implementation template in each issue
- Ensure all tests pass
- Update documentation
- Submit a PR that references the issue number
Questions?
Feel free to ask questions in the comments or open a discussion for broader topics.
Labels: roadmap, enhancement, meta-issue
Milestone: v2.0.0