
chore: unified file domain model #380

Merged
Israeltheminer merged 2 commits into main from chore/file-domain-model
Feb 6, 2026

Conversation

Israeltheminer (Collaborator) commented Feb 6, 2026

Summary

  • Add lib/shared/file-types.ts as single source of truth for all file type definitions across the platform
  • Migrate 13 files to import MIME types, extensions, accept strings, size limits, and classification helpers from the shared registry
  • Remove ~120 lines of duplicated file type logic scattered across chat uploads, document uploads, import forms, attachment processing, and parse routing

What's in the registry

  • MIME constants: MIME_TYPES.PDF, MIME_TYPES.DOCX, etc.
  • Grouped sets: IMAGE_MIME_TYPES, DOCUMENT_MIME_TYPES, SPREADSHEET_MIME_TYPES
  • Classification: isImage(), isTextFile(), isSpreadsheet(), isParseable()
  • Extensions: extractExtension(), getDisplayExtension()
  • Accept strings: CHAT_UPLOAD_ACCEPT, DOCUMENT_UPLOAD_ACCEPT, SPREADSHEET_IMPORT_ACCEPT
  • Size limits: CHAT_MAX_FILE_SIZE, DOCUMENT_MAX_FILE_SIZE
  • Parse routing: getParseEndpoint()
  • Display labels: getFileTypeLabelKey()
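
To make the shape of such a registry concrete, here is a minimal sketch. The export names follow the list above, but every value, signature, and the `TEXT_EXTENSIONS` helper set are assumptions for illustration; the actual contents of lib/shared/file-types.ts are not shown in this PR description.

```typescript
// Illustrative sketch only: names mirror the PR description, but all values and
// signatures are assumptions, not the contents of lib/shared/file-types.ts.

// MIME constants (standard IANA media types).
const MIME_TYPES = {
  PDF: "application/pdf",
  DOCX: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  PNG: "image/png",
  JPEG: "image/jpeg",
  TXT: "text/plain",
} as const;

// Grouped set: a strict allow-list, suitable for upload validation.
const IMAGE_MIME_TYPES: ReadonlySet<string> = new Set<string>([
  MIME_TYPES.PNG,
  MIME_TYPES.JPEG,
]);

// Hypothetical helper set used by isTextFile() below.
const TEXT_EXTENSIONS: ReadonlySet<string> = new Set<string>(["txt", "log", "md"]);

// Size limit (assumed value: 10 MiB).
const CHAT_MAX_FILE_SIZE = 10 * 1024 * 1024;

// Returns the lower-cased extension without the dot, or "" if there is none.
function extractExtension(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0 || dot === fileName.length - 1) return "";
  return fileName.slice(dot + 1).toLowerCase();
}

// Loose classification: any image/* type counts, even if not in the allow-list.
function isImage(mimeType: string): boolean {
  return mimeType.startsWith("image/");
}

// Loose classification: text/* MIME type or a known text extension.
function isTextFile(fileName: string, mimeType = ""): boolean {
  return mimeType.startsWith("text/") || TEXT_EXTENSIONS.has(extractExtension(fileName));
}

// Strict validation combines the allow-list with the size limit.
function isAllowedChatImage(mimeType: string, sizeBytes: number): boolean {
  return IMAGE_MIME_TYPES.has(mimeType) && sizeBytes <= CHAT_MAX_FILE_SIZE;
}
```

In this sketch, `isImage("image/svg+xml")` is true while `isAllowedChatImage("image/svg+xml", 1024)` is false, which illustrates why a loose classifier and a strict allow-list are not interchangeable.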

Files migrated

| File | What was replaced |
| --- | --- |
| use-convex-file-upload.ts | 10-line MIME array + hardcoded size limit |
| chat-input.tsx | Hardcoded accept string + MIME-to-label chain |
| document-upload-dialog.tsx | Long accept string + MAX_FILE_SIZE_BYTES |
| use-document-upload.ts | MAX_FILE_SIZE_BYTES constant |
| vendor-import-form.tsx | .endsWith() checks + hardcoded accept |
| product-import-form.tsx | Same pattern |
| customer-import-form.tsx | Same pattern |
| parse_file.ts | 14-line getParseEndpoint() |
| extract_extension.ts | 37-line function → re-export |
| process_attachments.ts | Inline isTextFile + image classification |
| document-helpers.ts | 25-line getFileExtension() → re-export |
| document-icon.tsx | Local getExtension() |
| file-parsing.ts | isExcelFile() → uses shared isSpreadsheet() |
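
As an illustration of the import-form rows, the following sketch contrasts a per-form `.endsWith()` chain with a shared helper and an accept string derived from one extension list. `SPREADSHEET_EXTENSIONS` and `isAllowedBefore` are hypothetical names, and the `isSpreadsheet()` signature is an assumption; only the `.xlsx`, `.xls`, `.csv` set comes from the test plan.

```typescript
// Hypothetical single list of extensions; the accept string is derived from it,
// so the <input accept="..."> value and the validation logic cannot drift apart.
const SPREADSHEET_EXTENSIONS: readonly string[] = ["xlsx", "xls", "csv"];
const SPREADSHEET_IMPORT_ACCEPT = SPREADSHEET_EXTENSIONS.map((ext) => `.${ext}`).join(",");

// Before (assumed shape of the per-form check that was duplicated three times).
function isAllowedBefore(fileName: string): boolean {
  const lower = fileName.toLowerCase();
  return lower.endsWith(".xlsx") || lower.endsWith(".xls") || lower.endsWith(".csv");
}

// After: one shared helper, assumed here to classify by file name.
function isSpreadsheet(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return false;
  const ext = fileName.slice(dot + 1).toLowerCase();
  return SPREADSHEET_EXTENSIONS.includes(ext);
}
```

With this arrangement, supporting another spreadsheet format would mean editing one array rather than three forms.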

Test plan

  • Verify chat file upload accepts images, PDFs, Word, PowerPoint, and text files
  • Verify document upload dialog accepts all supported document types
  • Verify vendor/product/customer import forms accept .xlsx, .xls, .csv only
  • Verify document icons render correctly for all file types
  • Verify file parsing routes to correct endpoints (PDF, DOCX, PPTX)
  • Verify attachment processing correctly classifies images, text files, and documents

Summary by CodeRabbit

  • Refactor
    • Centralized file-handling configuration across the platform, including file type validation, size limits, and display labels.
    • Unified file upload logic for documents, chat attachments, spreadsheet imports, and other file-related features.
    • Standardized file type detection and validation to ensure consistent behavior throughout the application.

coderabbitai Bot commented Feb 6, 2026

Walkthrough

This PR centralizes file type handling by introducing a new shared file-types.ts module containing MIME type definitions, file classification helpers, upload accept strings, size limits, and parse endpoint routing. It then refactors multiple components and utilities across the codebase to use these centralized constants and helpers instead of hardcoded values or duplicated logic, moving file type validation, extension extraction, and MIME type checking from scattered locations into a single source of truth.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'chore: unified file domain model' accurately describes the main change: consolidating file-type logic into a centralized shared module. |

coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@services/platform/lib/shared/file-types.ts`:
- Around line 91-94: The isParseable and getParseEndpoint functions disagree on
allowed extensions which can cause silent fallback to the PDF parser; update
isParseable and the parsing-extension checks in getParseEndpoint to use the
exact same normalized extension set (include or exclude .doc/.ppt consistently)
and add a defensive processLogger/console.warn inside getParseEndpoint (or its
caller) to log when an unexpected extension is received and the code is falling
back to the PDF route; reference the isParseable function and getParseEndpoint
function names to locate where to align the extension list and insert the
warning.
- Around line 43-69: The MIME sets (IMAGE_MIME_TYPES, DOCUMENT_MIME_TYPES,
PRESENTATION_MIME_TYPES, SPREADSHEET_MIME_TYPES, TEXT_MIME_TYPES) are strict
allow-lists while the helper functions (isImage, isTextFile) use broader
heuristics (e.g., startsWith('image/') and extension checks); add brief doc
comments above each exported ReadonlySet (at least above IMAGE_MIME_TYPES and
TEXT_MIME_TYPES) explaining that the sets are for strict
validation/allow-listing and that isImage() and isTextFile() intentionally
perform looser classification for routing/UX, so they are not interchangeable —
mention examples like image/svg+xml and .txt/.log to clarify intent.
- Around line 218-225: Update getFileTypeLabelKey to detect spreadsheet MIME
types so they don't fall through to 'file': use the existing
SPREADSHEET_MIME_TYPES (and/or MIME_TYPES.XLS, MIME_TYPES.XLSX, MIME_TYPES.CSV)
to check if mimeType matches any spreadsheet entry (e.g.,
SPREADSHEET_MIME_TYPES.includes(mimeType) or compare against MIME_TYPES
constants) and return an appropriate label such as 'spreadsheet' (or specific
keys like 'xls'/'xlsx'/'csv' if you prefer); add this check before the final
return so spreadsheet MIME types are handled by getFileTypeLabelKey.
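
The first and third suggestions above can be sketched as follows. This is one possible shape under stated assumptions: the endpoint paths, the `PARSE_ENDPOINTS` map, `normalizeExtension()`, and every label key other than 'file' and 'spreadsheet' are hypothetical. Note also that if `SPREADSHEET_MIME_TYPES` is a `ReadonlySet`, as the second comment implies, membership is tested with `.has()` rather than `.includes()`.

```typescript
// Hypothetical endpoint paths; a single map drives both functions so that
// isParseable() and getParseEndpoint() cannot disagree on the extension set.
const PDF_ENDPOINT = "/parse/pdf";
const PARSE_ENDPOINTS: ReadonlyMap<string, string> = new Map<string, string>([
  ["pdf", PDF_ENDPOINT],
  ["docx", "/parse/docx"],
  ["pptx", "/parse/pptx"],
]);

// Accepts "docx", ".docx", or ".DOCX" and returns "docx".
function normalizeExtension(ext: string): string {
  return ext.trim().replace(/^\./, "").toLowerCase();
}

function isParseable(ext: string): boolean {
  return PARSE_ENDPOINTS.has(normalizeExtension(ext));
}

function getParseEndpoint(ext: string): string {
  const key = normalizeExtension(ext);
  const endpoint = PARSE_ENDPOINTS.get(key);
  if (endpoint === undefined) {
    // Defensive warning so the PDF fallback is never silent.
    console.warn(`getParseEndpoint: unexpected extension "${key}", falling back to PDF parser`);
    return PDF_ENDPOINT;
  }
  return endpoint;
}

/** Strict allow-list of spreadsheet MIME types (XLSX, XLS, CSV). */
const SPREADSHEET_MIME_TYPES: ReadonlySet<string> = new Set<string>([
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "application/vnd.ms-excel",
  "text/csv",
]);

// Label keys other than "file" and "spreadsheet" are illustrative.
function getFileTypeLabelKey(mimeType: string): string {
  if (mimeType.startsWith("image/")) return "image";
  if (mimeType === "application/pdf") return "pdf";
  // Spreadsheet check placed before the final fallback, as the review suggests.
  if (SPREADSHEET_MIME_TYPES.has(mimeType)) return "spreadsheet";
  return "file";
}
```

Whether legacy `.doc` and `.ppt` belong in the shared map is a product decision; the point of the sketch is only that both functions read from the same source.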

Add lib/shared/file-types.ts as single source of truth for MIME types,
file extensions, accept strings, size limits, classification helpers,
and parse endpoint routing. Migrate 13 files to use the shared registry,
removing ~120 lines of duplicated file type definitions.
Israeltheminer force-pushed the chore/file-domain-model branch from 30350ed to 8787c9b on February 6, 2026 16:29
Israeltheminer merged commit 095fa6f into main on Feb 6, 2026
2 checks passed
Israeltheminer deleted the chore/file-domain-model branch on February 6, 2026 16:37
yannickmonney pushed a commit that referenced this pull request on Apr 8, 2026