chore: unified file domain model by Israeltheminer · Pull Request #380 · tale-project/tale

Israeltheminer · 2026-02-06T16:00:39Z

Summary

- Add `lib/shared/file-types.ts` as the single source of truth for all file type definitions across the platform
- Migrate 13 files to import MIME types, extensions, accept strings, size limits, and classification helpers from the shared registry
- Remove ~120 lines of duplicated file type logic scattered across chat uploads, document uploads, import forms, attachment processing, and parse routing

What's in the registry

- MIME constants: `MIME_TYPES.PDF`, `MIME_TYPES.DOCX`, etc.
- Grouped sets: `IMAGE_MIME_TYPES`, `DOCUMENT_MIME_TYPES`, `SPREADSHEET_MIME_TYPES`
- Classification: `isImage()`, `isTextFile()`, `isSpreadsheet()`, `isParseable()`
- Extensions: `extractExtension()`, `getDisplayExtension()`
- Accept strings: `CHAT_UPLOAD_ACCEPT`, `DOCUMENT_UPLOAD_ACCEPT`, `SPREADSHEET_IMPORT_ACCEPT`
- Size limits: `CHAT_MAX_FILE_SIZE`, `DOCUMENT_MAX_FILE_SIZE`
- Parse routing: `getParseEndpoint()`
- Display labels: `getFileTypeLabelKey()`
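
To make the shape of the registry concrete, here is a minimal sketch of how a few of these pieces might fit together. It is not the contents of `lib/shared/file-types.ts`: the MIME strings are the standard registered values, but the set members, the helper signatures, and the helper bodies are assumptions made for illustration. The looser `startsWith('image/')` check in `isImage()` follows the behavior described in the review comments on this PR.

```typescript
// Hypothetical sketch; the real lib/shared/file-types.ts may differ.
// `export` keywords are omitted so the snippet runs as a single file.
const MIME_TYPES = {
  PDF: "application/pdf",
  DOCX: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  XLSX: "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  XLS: "application/vnd.ms-excel",
  CSV: "text/csv",
  PNG: "image/png",
  JPEG: "image/jpeg",
} as const;

// Strict allow-lists, intended for validation (members are assumed).
const IMAGE_MIME_TYPES: ReadonlySet<string> = new Set<string>([
  MIME_TYPES.PNG,
  MIME_TYPES.JPEG,
]);
const SPREADSHEET_MIME_TYPES: ReadonlySet<string> = new Set<string>([
  MIME_TYPES.XLSX,
  MIME_TYPES.XLS,
  MIME_TYPES.CSV,
]);
const SPREADSHEET_EXTENSIONS: ReadonlySet<string> = new Set<string>(["xlsx", "xls", "csv"]);

// Returns the lowercased extension without the dot, or "" when there is none.
function extractExtension(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0 || dot === fileName.length - 1) return "";
  return fileName.slice(dot + 1).toLowerCase();
}

// Loose classification for routing/UX: any image/* type counts.
function isImage(mimeType: string): boolean {
  return mimeType.startsWith("image/");
}

// Matches on MIME type first, then falls back to the file extension.
function isSpreadsheet(mimeType: string, fileName: string = ""): boolean {
  return (
    SPREADSHEET_MIME_TYPES.has(mimeType) ||
    SPREADSHEET_EXTENSIONS.has(extractExtension(fileName))
  );
}
```

Under these assumptions, `image/svg+xml` would pass `isImage()` while being absent from `IMAGE_MIME_TYPES`, which is the strict-versus-loose distinction the reviewer asks to have documented.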

Files migrated

| File | What was replaced |
|------|-------------------|
| `use-convex-file-upload.ts` | 10-line MIME array + hardcoded size limit |
| `chat-input.tsx` | Hardcoded accept string + MIME-to-label chain |
| `document-upload-dialog.tsx` | Long accept string + `MAX_FILE_SIZE_BYTES` |
| `use-document-upload.ts` | `MAX_FILE_SIZE_BYTES` constant |
| `vendor-import-form.tsx` | `.endsWith()` checks + hardcoded accept |
| `product-import-form.tsx` | Same pattern |
| `customer-import-form.tsx` | Same pattern |
| `parse_file.ts` | 14-line `getParseEndpoint()` |
| `extract_extension.ts` | 37-line function → re-export |
| `process_attachments.ts` | Inline `isTextFile` + image classification |
| `document-helpers.ts` | 25-line `getFileExtension()` → re-export |
| `document-icon.tsx` | Local `getExtension()` |
| `file-parsing.ts` | `isExcelFile()` → uses shared `isSpreadsheet()` |
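
As an illustration of the import-form rows, the sketch below contrasts an assumed per-form `.endsWith()` check with a call to a shared helper. The "before" function is a guess at what such a check could have looked like, not the removed code, and the `isSpreadsheet()` signature and the value of `SPREADSHEET_IMPORT_ACCEPT` are hypothetical.

```typescript
// Hypothetical stand-ins for the shared registry; values are assumptions.
const SPREADSHEET_EXTENSIONS: ReadonlySet<string> = new Set<string>(["xlsx", "xls", "csv"]);
const SPREADSHEET_IMPORT_ACCEPT = ".xlsx,.xls,.csv";

// Shared helper: case-insensitive extension match against one list.
function isSpreadsheet(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  return dot > 0 && SPREADSHEET_EXTENSIONS.has(fileName.slice(dot + 1).toLowerCase());
}

// Before (assumed shape): each form repeats its own suffix checks,
// which are case-sensitive and easy to let drift between forms.
function isAllowedBefore(fileName: string): boolean {
  return (
    fileName.endsWith(".xlsx") ||
    fileName.endsWith(".xls") ||
    fileName.endsWith(".csv")
  );
}

// After: every form delegates to the same helper.
function isAllowedAfter(fileName: string): boolean {
  return isSpreadsheet(fileName);
}
```

With a single helper, a change such as adding a new spreadsheet extension would only need to happen in one place, and the accept string passed to the file input can be kept next to the list it mirrors.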

Test plan

- Verify chat file upload accepts images, PDFs, Word, PowerPoint, and text files
- Verify document upload dialog accepts all supported document types
- Verify vendor/product/customer import forms accept .xlsx, .xls, .csv only
- Verify document icons render correctly for all file types
- Verify file parsing routes to correct endpoints (PDF, DOCX, PPTX)
- Verify attachment processing correctly classifies images, text files, and documents
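
Some of these checks could also be exercised in code. The sketch below shows one way a chat upload might be validated against an accept string and a size limit. The 10 MiB limit, the accept string, and the `validateChatUpload()` and `matchesAccept()` helpers are all hypothetical, since the PR does not state the actual values or expose such functions.

```typescript
// Hypothetical values; the PR does not state the real limit or accept list.
const CHAT_MAX_FILE_SIZE = 10 * 1024 * 1024; // assumed 10 MiB
const CHAT_UPLOAD_ACCEPT = "image/*,.pdf,.docx,.pptx,.txt";

interface UploadCandidate {
  name: string;
  type: string; // MIME type as reported by the browser
  size: number; // bytes
}

type UploadVerdict = "ok" | "too-large" | "unsupported-type";

// Each accept token is an extension (".pdf"), a wildcard ("image/*"),
// or an exact MIME type, matching how an <input accept> value is written.
function matchesAccept(file: UploadCandidate, accept: string): boolean {
  const name = file.name.toLowerCase();
  const type = file.type.toLowerCase();
  return accept
    .split(",")
    .map((token) => token.trim().toLowerCase())
    .filter((token) => token.length > 0)
    .some((token) => {
      if (token.startsWith(".")) return name.endsWith(token);
      if (token.endsWith("/*")) return type.startsWith(token.slice(0, -1));
      return type === token;
    });
}

function validateChatUpload(file: UploadCandidate): UploadVerdict {
  if (file.size > CHAT_MAX_FILE_SIZE) return "too-large";
  return matchesAccept(file, CHAT_UPLOAD_ACCEPT) ? "ok" : "unsupported-type";
}
```

The `accept` attribute only filters the browser's file picker, so a check of this kind would still be needed for files that arrive by drag and drop or paste.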

Summary by CodeRabbit

Refactor
- Centralized file-handling configuration across the platform, including file type validation, size limits, and display labels.
- Unified file upload logic for documents, chat attachments, spreadsheet imports, and other file-related features.
- Standardized file type detection and validation to ensure consistent behavior throughout the application.

coderabbitai · 2026-02-06T16:05:34Z

Walkthrough

This PR centralizes file type handling by introducing a new shared file-types.ts module containing MIME type definitions, file classification helpers, upload accept strings, size limits, and parse endpoint routing. It then refactors multiple components and utilities across the codebase to use these centralized constants and helpers instead of hardcoded values or duplicated logic, moving file type validation, extension extraction, and MIME type checking from scattered locations into a single source of truth.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

- refactor(platform): update file upload components #344: Modifies file upload flows in the same files (use-convex-file-upload, chat-input, import forms) to centralize file-type handling and upload accept constants.
- feat: add file attachment support for chat agent with document parsing #12: Touches the file parsing helper (parse_file) and chat attachment parsing paths that are also refactored in this PR.
- fix: ui improvements #377: Updates chat file-upload/display logic and file-type utilities in overlapping files like chat-input.tsx and file-type extraction helpers.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|------------|--------|-------------|------------|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'chore: unified file domain model' accurately describes the main change: consolidating file-type logic into a centralized shared module. |


coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@services/platform/lib/shared/file-types.ts`:
- Around line 91-94: The isParseable and getParseEndpoint functions disagree on
allowed extensions which can cause silent fallback to the PDF parser; update
isParseable and the parsing-extension checks in getParseEndpoint to use the
exact same normalized extension set (include or exclude .doc/.ppt consistently)
and add a defensive processLogger/console.warn inside getParseEndpoint (or its
caller) to log when an unexpected extension is received and the code is falling
back to the PDF route; reference the isParseable function and getParseEndpoint
function names to locate where to align the extension list and insert the
warning.
- Around line 43-69: The MIME sets (IMAGE_MIME_TYPES, DOCUMENT_MIME_TYPES,
PRESENTATION_MIME_TYPES, SPREADSHEET_MIME_TYPES, TEXT_MIME_TYPES) are strict
allow-lists while the helper functions (isImage, isTextFile) use broader
heuristics (e.g., startsWith('image/') and extension checks); add brief doc
comments above each exported ReadonlySet (at least above IMAGE_MIME_TYPES and
TEXT_MIME_TYPES) explaining that the sets are for strict
validation/allow-listing and that isImage() and isTextFile() intentionally
perform looser classification for routing/UX, so they are not interchangeable —
mention examples like image/svg+xml and .txt/.log to clarify intent.
- Around line 218-225: Update getFileTypeLabelKey to detect spreadsheet MIME
types so they don't fall through to 'file': use the existing
SPREADSHEET_MIME_TYPES (and/or MIME_TYPES.XLS, MIME_TYPES.XLSX, MIME_TYPES.CSV)
to check if mimeType matches any spreadsheet entry (e.g.,
SPREADSHEET_MIME_TYPES.includes(mimeType) or compare against MIME_TYPES
constants) and return an appropriate label such as 'spreadsheet' (or specific
keys like 'xls'/'xlsx'/'csv' if you prefer); add this check before the final
return so spreadsheet MIME types are handled by getFileTypeLabelKey.
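
The first comment, about `isParseable` and `getParseEndpoint` disagreeing, could be addressed by deriving both from one table. The sketch below is one possible arrangement, not the fix that was merged: the endpoint paths, the extension list, and the injectable `warn` parameter are hypothetical.

```typescript
// Hypothetical endpoint paths and extension list; neither appears in the PR.
const PDF_ENDPOINT = "/parse/pdf";
const PARSE_ENDPOINTS: ReadonlyMap<string, string> = new Map<string, string>([
  ["pdf", PDF_ENDPOINT],
  ["docx", "/parse/docx"],
  ["pptx", "/parse/pptx"],
]);

// Lowercased extension without the dot, or "" when there is none.
function normalizeExtension(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0 || dot === fileName.length - 1) return "";
  return fileName.slice(dot + 1).toLowerCase();
}

// Both functions read the same map, so their extension lists cannot drift.
function isParseable(fileName: string): boolean {
  return PARSE_ENDPOINTS.has(normalizeExtension(fileName));
}

function getParseEndpoint(
  fileName: string,
  warn: (message: string) => void = (message) => console.warn(message),
): string {
  const ext = normalizeExtension(fileName);
  const endpoint = PARSE_ENDPOINTS.get(ext);
  if (endpoint !== undefined) return endpoint;
  // Make the fallback visible instead of silently routing to the PDF parser.
  warn(`Unexpected extension "${ext}" for "${fileName}"; falling back to ${PDF_ENDPOINT}`);
  return PDF_ENDPOINT;
}
```

Whether `.doc` and `.ppt` belong in the table is the decision the reviewer asks for; in this arrangement the answer would apply to both functions at once.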

Add lib/shared/file-types.ts as single source of truth for MIME types, file extensions, accept strings, size limits, classification helpers, and parse endpoint routing. Migrate 13 files to use the shared registry, removing ~120 lines of duplicated file type definitions.

coderabbitai Bot requested changes Feb 6, 2026

Israeltheminer force-pushed the chore/file-domain-model branch from 30350ed to 8787c9b Compare February 6, 2026 16:29

fix: handle spreadsheet MIME types in getFileTypeLabelKey

80305df
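
This follow-up commit corresponds to the third review comment. A minimal sketch of such a check is shown below; it is not the committed code, and the label keys other than the `'file'` fallback and the suggested `'spreadsheet'` are hypothetical. The review text suggests `SPREADSHEET_MIME_TYPES.includes(mimeType)`, but the same review describes the sets as `ReadonlySet`, which has no `includes` method, so the sketch uses `.has()`.

```typescript
const MIME_TYPES = {
  PDF: "application/pdf",
  DOCX: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  XLSX: "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  XLS: "application/vnd.ms-excel",
  CSV: "text/csv",
} as const;

// Assumed membership, based on the MIME constants named in the review.
const SPREADSHEET_MIME_TYPES: ReadonlySet<string> = new Set<string>([
  MIME_TYPES.XLSX,
  MIME_TYPES.XLS,
  MIME_TYPES.CSV,
]);

// Label keys other than "spreadsheet" and "file" are hypothetical.
function getFileTypeLabelKey(mimeType: string): string {
  if (mimeType.startsWith("image/")) return "image";
  if (mimeType === MIME_TYPES.PDF) return "pdf";
  if (mimeType === MIME_TYPES.DOCX) return "doc";
  // Checked before the final fallback so spreadsheets no longer map to "file".
  if (SPREADSHEET_MIME_TYPES.has(mimeType)) return "spreadsheet";
  return "file";
}
```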

coderabbitai Bot approved these changes Feb 6, 2026

Israeltheminer merged commit 095fa6f into main Feb 6, 2026
2 checks passed

Israeltheminer deleted the chore/file-domain-model branch February 6, 2026 16:37

coderabbitai Bot mentioned this pull request Feb 7, 2026

fix: UX polish - cursor-pointer, right-align, aria-labels #391

Merged

3 tasks

coderabbitai Bot mentioned this pull request Mar 4, 2026

fix(platform): restrict file uploads to agent's enabled document tools (#651) #672

Merged

3 tasks

yannickmonney pushed a commit that referenced this pull request Apr 8, 2026

chore: unified file domain model (#380)

a632c22

Israeltheminer merged 2 commits into main from chore/file-domain-model
