dev#63

Merged
excoffierleonard merged 6 commits into main from dev
Jan 23, 2026

@excoffierleonard
Owner

  • feat: add document parsing functionality for various formats
  • Update dependencies and refactor web server functionality
  • refactor: clean up Dockerfile and remove unnecessary comments; update entrypoint for parser
  • refactor: remove common test utilities and replace with direct file path handling in tests
  • refactor: remove obsolete benchmark, build, and deployment test scripts
  • feat: add CI/CD workflow for Docker image build, publish, and deployment

- Implemented DOCX parser using docx_rs for extracting text from Microsoft Word documents.
- Added image parser utilizing Tesseract OCR for text extraction from images (PNG, JPEG, WebP).
- Created PDF parser using pdf_extract for extracting text from PDF documents.
- Developed PPTX parser for extracting text from Microsoft PowerPoint presentations.
- Introduced XLSX parser using calamine for extracting text from Excel spreadsheets.
- Added plain text parser for handling UTF-8 encoded text files, including TXT, CSV, and JSON formats.
- Established a web API using Actix for file parsing, supporting multipart file uploads.
- Implemented error handling for API responses with appropriate status codes.
- Added tests for all parsers and API endpoints to ensure functionality and correctness.
- Included assets for testing various file formats in the tests directory.
- Updated dependencies in Cargo.toml for improved performance and security.
- Changed description and categories in Cargo.toml for clarity.
- Refactored main.rs to simplify server initialization and remove unnecessary conditionals.
- Renamed web module documentation to reflect web server functionality.
- Updated routes documentation to clarify purpose.
- Simplified static file serving logic in static_files.rs, improving error handling and response structure.
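
As a rough illustration of the plain text parser and the status-code error handling listed above, here is a minimal, standard-library-only sketch. It is not the code from this PR: `ParseError`, `parse_text`, `parse_by_extension`, and `status_code` are hypothetical names, dispatching on file extension is an assumption (the real service may detect formats differently), and the 415/422 mapping is one plausible choice rather than the one the API necessarily uses. The binary formats (DOCX, PDF, PPTX, XLSX, images) are left out because they depend on external crates.

```rust
/// Hypothetical error type; the real project's error enum is not shown in this PR.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// The extension is not handled by this sketch.
    UnsupportedFormat(String),
    /// The input bytes are not valid UTF-8.
    InvalidUtf8,
}

/// Plain text parsing: validate that the bytes are UTF-8 and return them as a String.
pub fn parse_text(data: &[u8]) -> Result<String, ParseError> {
    std::str::from_utf8(data)
        .map(str::to_owned)
        .map_err(|_| ParseError::InvalidUtf8)
}

/// Dispatch on a file extension (an assumption made for this sketch).
/// Only the text-based formats named in the description are covered here.
pub fn parse_by_extension(ext: &str, data: &[u8]) -> Result<String, ParseError> {
    match ext.to_ascii_lowercase().as_str() {
        "txt" | "csv" | "json" => parse_text(data),
        other => Err(ParseError::UnsupportedFormat(other.to_string())),
    }
}

/// Map an error to an HTTP status code (assumed mapping, not confirmed by the PR).
pub fn status_code(err: &ParseError) -> u16 {
    match err {
        ParseError::UnsupportedFormat(_) => 415, // Unsupported Media Type
        ParseError::InvalidUtf8 => 422,          // Unprocessable Content
    }
}
```

With these definitions, `parse_by_extension("txt", b"hello")` yields `Ok("hello".to_string())`, while passing bytes that are not valid UTF-8 yields `Err(ParseError::InvalidUtf8)`, which `status_code` maps to 422.
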
@excoffierleonard excoffierleonard merged commit b2c494a into main Jan 23, 2026
3 checks passed
@excoffierleonard excoffierleonard deleted the dev branch January 23, 2026 23:49