Kaan kien tiffany/feature/docs editing adeu #260

Open
kien-ship-it wants to merge 9 commits into main from
Kaan-Kien-Tiffany/feature/docs-editing-adeu

@kien-ship-it
Collaborator

@kien-ship-it kien-ship-it commented Apr 5, 2026

Adeu DOCX Redlining Integration

Integrates the Adeu automated DOCX redlining engine (v0.9.0) into LaunchStack. LLM agents can now read, edit, diff, and accept changes in Word documents via HTTP, with edits appearing as native Track Changes.

What's included

Python sidecar (5 endpoints)

  • POST /adeu/read — extract text from DOCX (with optional clean view)
  • POST /adeu/process-batch — apply edits + review actions as tracked changes
  • POST /adeu/accept-all — accept all tracked changes, return clean DOCX
  • POST /adeu/apply-edits-markdown — preview edits as CriticMarkup markdown
  • POST /adeu/diff — text-based diff between two DOCX files
  • Health check extended to report Adeu availability and version

TypeScript adapter (src/lib/adeu/client.ts, src/lib/adeu/types.ts)

  • Typed async functions wrapping each sidecar endpoint
  • ADEU_SERVICE_URL environment-based routing (Docker → sidecar, Vercel → serverless function, local → localhost)
  • AdeuConfigError / AdeuServiceError error classes

Inngest background job (src/server/inngest/functions/modifyDocument.ts)

  • document/modify.requested event handler with 3 retryable steps (fetch → modify → store)
  • Serial processing (concurrency: 1), 422 validation errors marked as failed without retry

Vercel Python serverless function (api/adeu/index.py)

  • Mirrors all 5 sidecar endpoints, no torch/ML deps, fits under 250MB Vercel limit

Docker Compose wiring

  • Sidecar service with health check, ADEU_SERVICE_URL injected into app container

LLM tool-calling integration test (scripts/test-adeu-llm-loop.ts)

  • End-to-end test: real LLM (gpt-4o) autonomously reads a SAFE template, previews edits, then applies them via processDocumentBatch, producing a modified DOCX with Track Changes on disk

Test coverage

| Suite | File | Tests | What it covers |
| --- | --- | --- | --- |
| Sidecar routes (Python) | sidecar/tests/test_adeu_routes.py | 10 | All 5 endpoints + health check, valid/invalid inputs, error codes |
| Adapter unit tests | __tests__/api/adeu/adapter.test.ts | 47 | All adapter functions, FormData construction, error handling, config errors |
| Adapter PBT — round-trip | __tests__/api/adeu/adapter-roundtrip.pbt.test.ts | 6 | Property 11: request/response round-trip for all adapters |
| Adapter PBT — errors | __tests__/api/adeu/adapter-errors.pbt.test.ts | 6 | Property 12: error propagation for random HTTP status codes |
| Inngest function | __tests__/api/adeu/modifyDocument.test.ts | 16 | Registration, concurrency, step execution, 422 handling, DB updates, route registration |
| LLM integration | scripts/test-adeu-llm-loop.ts | manual | Full agent loop: LLM → read → preview → apply → DOCX output |

Run tests:

# TypeScript (75 tests)
npx jest __tests__/api/adeu/

# Python sidecar (10 tests)
PYTHONPATH=sidecar /tmp/sidecar-test-venv/bin/python -m pytest sidecar/tests/ -v

# LLM integration (requires running sidecar + OPENAI_API_KEY)
/tmp/sidecar-test-venv/bin/python scripts/run-adeu-sidecar.py &
npx tsx scripts/test-adeu-llm-loop.ts

Key files changed

src/lib/adeu/types.ts                              # TypeScript interfaces
src/lib/adeu/client.ts                             # TypeScript adapter
src/server/inngest/client.ts                       # DocumentModifyEvent type
src/server/inngest/functions/modifyDocument.ts      # Inngest function
src/app/api/inngest/route.ts                       # Route registration
sidecar/app/schemas/adeu.py                        # Shared Pydantic schemas
sidecar/app/routes/adeu.py                         # FastAPI endpoints
sidecar/app/main.py                                # Router + health check
sidecar/Dockerfile                                 # Separate Adeu pip layer
sidecar/requirements.txt                           # adeu==0.9.0
api/adeu/index.py                                  # Vercel serverless function
api/adeu/requirements.txt                          # Vercel-only deps (no torch)
docker-compose.yml                                 # Sidecar service + env wiring
.env.example                                       # ADEU_SERVICE_URL docs
.vercelignore                                      # Exclude sidecar/ from Vercel
scripts/test-adeu-llm-loop.ts                      # LLM integration test
scripts/run-adeu-sidecar.py                        # Lightweight sidecar runner

LLM integration test evidence

The test at scripts/test-adeu-llm-loop.ts was run against the SAFE template at public/templates/safe-template.docx. The LLM (gpt-4o) autonomously executed the full tool-calling loop.

Artifacts produced:

  • Log: adeu-llm-test.log — full timestamped trace of every LLM decision, tool call, and tool result
  • Modified DOCX: test-output/safe-modified-1775385077680.docx — the actual output document with Track Changes

What the log shows (3 iterations):

Iteration 1: LLM calls read_docx(clean_view=false)
             → Sidecar returns 5,632 chars of document text

Iteration 2: LLM calls edit_document with 3 edits using unique surrounding context:
             - "Company: {company_name}, a" → "Company: Tech Innovators Inc., a"
             - "Investor: {investor_name}, with an address at" → "Investor: John Doe Ventures, with an address at"
             - 'The "Valuation Cap" for this SAFE is {valuation_cap}.' → 'The "Valuation Cap" for this SAFE is $10,000,000.'
             → Sidecar returns CriticMarkup preview (5,994 chars)

Iteration 3: LLM calls apply_edits(author_name="Contract Review Assistant", ...)
             → Sidecar applies all 3 edits as Track Changes
             → Summary: { applied_edits: 3, skipped_edits: 0 }
             → Modified DOCX written to test-output/

Verified output document contains native Track Changes:

Company: {--{company_name},--}{++Tech Innovators Inc.,++}{>>[Chg:5] Contract Review Assistant
Investor: {--{investor_name},--}{++John Doe Ventures,++}{>>[Chg:3] Contract Review Assistant
The "Valuation Cap" for this SAFE is {--{valuation_cap}.--}{++$10,000,000.++}{>>[Chg:1] Contract Review Assistant

The output DOCX can be opened in Word or Google Docs to see the redlines with author attribution.
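
For anyone who wants to check a preview programmatically, here is a minimal sketch that pulls (deleted, inserted) pairs out of CriticMarkup text. It relies only on the standard CriticMarkup delimiters `{--…--}` and `{++…++}`. The function name `extract_replacements` is hypothetical and is not part of the adapter or the sidecar.

```python
import re

# CriticMarkup marks deletions as {--text--} and insertions as {++text++}.
# This pattern matches a deletion immediately followed by an insertion,
# which is how a replacement shows up in the preview lines above.
_SUBSTITUTION = re.compile(r"\{--(.*?)--\}\{\+\+(.*?)\+\+\}", re.DOTALL)


def extract_replacements(markup: str) -> list[tuple[str, str]]:
    """Return (deleted, inserted) pairs found in a CriticMarkup string."""
    return _SUBSTITUTION.findall(markup)


line = "Company: {--{company_name},--}{++Tech Innovators Inc.,++}"
for old, new in extract_replacements(line):
    print(f"{old!r} -> {new!r}")
```

Standalone deletions or insertions (without the paired counterpart) are ignored by this pattern, which is enough for placeholder substitutions like the three in this test run.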

kien-ship-it and others added 6 commits March 28, 2026 12:59
- Add Python schemas package with Pydantic models for document edits, review actions, and batch processing
- Create ADEU schema definitions for DocumentEdit, ReviewAction, and related request/response types
- Add TypeScript client library with service interface and error handling classes
- Create TypeScript type definitions matching Python schema structure for type safety
- Update sidecar requirements to include pydantic and adeu dependencies
- Organize ML dependencies with comments for clarity
- Establish foundation for document batch processing and review workflows
- Add Adeu pip package installation (v0.9.0) in separate Docker layer for independent caching
- Create new /adeu/* route module with endpoints for reading, editing, and processing DOCX files
- Implement POST /adeu/read endpoint to extract text from DOCX with clean view option
- Implement POST /adeu/process-batch endpoint to apply edits and review actions in single request
- Implement POST /adeu/accept-all endpoint to accept all changes and return clean DOCX
- Implement POST /adeu/reject-all endpoint to reject all changes and return clean DOCX
- Implement POST /adeu/diff endpoint to generate edit suggestions from text comparison
- Implement POST /adeu/apply-edits-markdown endpoint to apply edits to markdown representation
- Add Adeu schemas module with request/response models for type validation
- Update main.py to register adeu router and include Adeu availability in health check
- Enables document editing and change tracking workflows through unified API
- Add ADEU_SERVICE_URL environment variable configuration with Docker, local, and Vercel deployment options
- Update .vercelignore to exclude sidecar directory due to ML dependency size constraints
- Fix TypeScript type safety in modifyDocument tests with proper error type casting and unknown error handling
- Correct Blob initialization in adapter tests using Uint8Array wrapper for buffer compatibility
- Add test automation scripts for ADEU LLM loop testing and sidecar process management
- Include comprehensive test output and logging for ADEU service validation
- Update Docker Compose and Vercel configuration for improved service orchestration
@vercel

vercel bot commented Apr 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| launch-stack | Ready | Preview, Comment | Apr 9, 2026 5:25am |
| pdr-ai-v2 | Ready | Preview, Comment | Apr 9, 2026 5:25am |

@kien-ship-it kien-ship-it added the enhancement New feature or request label Apr 5, 2026
@kien-ship-it kien-ship-it linked an issue Apr 5, 2026 that may be closed by this pull request
- Update test execution timestamps to reflect latest test run (2026-04-05T20:48:43.125Z)
- Refine investor name target text matching to improve accuracy in document edits
- Add apply_edits tool call in LLM response to consolidate multiple edit operations
- Generate safe-modified DOCX output file from test execution
- Streamline iteration flow by consolidating edit operations in single tool call
JunzheShi0702 added a commit that referenced this pull request Apr 6, 2026
Implements GitHub issue #266 by connecting the legal pipeline editing layer
to Kien's Adeu DOCX redlining service.

Changes:
- Created /api/legal/apply-edits endpoint with Clerk auth and Zod validation
- Enhanced LegalDocumentEditor with Track Changes button and multi-step toast notifications
- Added comprehensive error handling with user-friendly messages
- Implemented 18 unit tests covering auth, validation, integration, errors, and edge cases
- Merged PR #260 (Adeu DOCX redlining integration)
Owner

@Deodat-Lawson Deodat-Lawson left a comment

Code Review: Kaan-Kien-Tiffany/feature/docs-editing-adeu

This branch adds an ADEU (document editing/redlining) feature across ~3,600 lines in 31 files.

What This Branch Does

  1. Python sidecar (sidecar/app/routes/adeu.py) — FastAPI endpoints for reading, diffing, redlining, and accepting changes in DOCX files using the adeu library
  2. Vercel serverless proxy (api/adeu/index.py) — Python handler that proxies ADEU requests for Vercel deployment
  3. TypeScript client (src/lib/adeu/client.ts) — Fetch-based client wrapping all sidecar endpoints
  4. Inngest function (src/server/inngest/functions/modifyDocument.ts) — Background job that fetches a DOCX, sends it to the sidecar for redlining, and saves the result
  5. Tests & scripts — Unit tests, property-based tests, integration scripts, and Docker Compose setup

Critical Issues

| Severity | Issue | Location |
| --- | --- | --- |
| Critical | Missing API route — vercel.json rewrites /api/adeu/* to a non-existent handler | vercel.json:8 |
| Critical | Base64 DOCX passed through Inngest step outputs will hit 4MB limit for any real document | modifyDocument.ts:49-92 |
| Critical | Global concurrency limit of 1 serializes ALL users' doc edits | modifyDocument.ts:20 |
| High | cgi.FieldStorage is deprecated (Python 3.11) and removed in Python 3.13 | api/adeu/index.py:144-157 |
| High | No failure status tracking — failed docs are indistinguishable from successful ones | modifyDocument.ts:29-30 |
| High | Sync DOCX processing blocks the async event loop in all sidecar handlers | sidecar/app/routes/adeu.py |
| High | No upload size limits — memory exhaustion DoS possible | Both adeu.py routes and index.py |
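
To make the base64 concern concrete, here is a rough size calculation. Base64 emits 4 output characters for every 3 input bytes, so a DOCX grows by about a third before it reaches the step output. The sketch treats the 4MB limit quoted above as 4 MiB and ignores the JSON envelope around the payload; both are assumptions, and the helper names are hypothetical.

```python
import base64

# The 4MB figure from the review, interpreted here as 4 MiB (an assumption).
STEP_OUTPUT_LIMIT = 4 * 1024 * 1024


def base64_size(raw_size: int) -> int:
    """Length in characters of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def fits_in_step_output(raw_size: int, limit: int = STEP_OUTPUT_LIMIT) -> bool:
    """True if the encoded payload alone stays within the limit."""
    return base64_size(raw_size) <= limit


three_mib = 3 * 1024 * 1024
# Cross-check the formula against the real encoder.
assert base64_size(three_mib) == len(base64.b64encode(bytes(three_mib)))
print(fits_in_step_output(three_mib), fits_in_step_output(three_mib + 1))
```

Under these assumptions the practical ceiling is a DOCX of roughly 3 MiB, and less once the surrounding JSON is counted.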

Security Issues

| Severity | Issue | Location |
| --- | --- | --- |
| High | Internal exception details leaked to clients in all 500 paths | Both Python files |
| High | OPENAI_API_KEY in Docker build args — visible in image layer metadata | docker-compose.yml:14-15 |
| Medium | No authentication on sidecar requests | client.ts — all fetch calls |
| Medium | Content-Disposition header injection via unsanitized filenames | adeu.py:146, index.py:290 |
| Medium | Container runs as root (no USER directive) | sidecar/Dockerfile |
| Medium | adeu-llm-test.log committed — leaks developer filesystem path | adeu-llm-test.log |

Code Quality / Architecture

| Severity | Issue | Location |
| --- | --- | --- |
| Medium | Uncaught JSON.parse on malformed x-batch-summary header | client.ts:103 |
| Medium | No fetch timeouts — hung sidecar blocks until platform kills the function | client.ts |
| Medium | Side-effecting DB write inside memoized step.run — will re-execute on Inngest replay | modifyDocument.ts:71-74 |
| Medium | ADEU import is unconditional but main.py has "degraded" health path that's dead code | main.py:17 |
| Medium | Bare except Exception on import swallows real errors | index.py:39 |
| Low | adeu installed twice in Dockerfile | Dockerfile:13-19 |

Test Gaps

| Issue | Location |
| --- | --- |
| Inngest handler test is vacuous: the if (handler) guard silently skips the test body if the SDK shape changes | modifyDocument.test.ts:145-146 |
| Route registration test always passes (expect(true).toBe(true)) | modifyDocument.test.ts:384-392 |
| Missing negative test cases for /adeu/diff, /adeu/accept-all, /adeu/apply-edits-markdown | test_adeu_routes.py |
| Hardcoded version 0.9.0 assertion will break on upgrade | test-sidecar-integration.sh:62 |
| test-output/*.docx binary files committed to the repo | test-output/ |

Top Recommendations

  1. Fix the missing /api/adeu route or the Vercel deployment is broken
  2. Use Vercel Blob or S3 instead of base64 through Inngest steps to avoid the 4MB limit
  3. Key the concurrency limit by documentId instead of global limit: 1
  4. Add a status field to document DB updates so failures are trackable
  5. Remove adeu-llm-test.log and test-output/ from version control and add to .gitignore
  6. Wrap sync DOCX processing in asyncio.to_thread() in the sidecar
  7. Add fetch timeouts and sidecar auth in the TypeScript client

@kien-ship-it
Collaborator Author

On it!

- Add infrastructure bug condition tests verifying fixes 1.8, 1.11, 1.12, 1.18
- Add modifyDocument bug condition tests covering fixes 1.1, 1.2, 1.4, 1.15, 1.19, 1.20
- Add modifyDocument preservation tests ensuring core functionality remains intact
- Add ADEU client bug condition and preservation tests for TypeScript integration
- Add Python sidecar preservation tests for ADEU route handling
- Add Python sidecar bug condition tests for ADEU service integration
- Update modifyDocument test suite with additional test cases
- Update ADEU client TypeScript implementation with improved error handling
- Update sidecar routes and configuration for enhanced ADEU service integration
- Update test infrastructure and Docker configuration for improved test execution
- Add *.log to .gitignore to prevent accidental log file commits
- Remove stale adeu-llm-test.log file from repository
- Improve test-sidecar-integration.sh script for better test coverage
@kien-ship-it
Collaborator Author

Fix 1.1 — Blob storage for large DOCX: modifyDocument.ts now calls putFile() inside step.run("modify-document") and returns { summary, blobUrl: stored.url } instead of base64.

Fix 1.2 — Per-document concurrency key: the function config is now concurrency: [{ limit: 1, key: "event.data.documentId" }], so the limit of 1 applies per document rather than globally.

Fix 1.3 — Replace cgi.FieldStorage: index.py no longer imports cgi. It uses manual boundary parsing via _parse_multipart() with regex-based Content-Type boundary extraction.
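
For reference, a minimal sketch of what regex-based boundary extraction plus manual part splitting can look like. This is not the code in index.py; the function names and the returned dict shape are hypothetical, and the parser skips edge cases a production handler would need to cover (for example a payload that happens to contain the delimiter bytes).

```python
import re

_BOUNDARY_RE = re.compile(r'boundary="?([^";]+)"?', re.IGNORECASE)
# \b keeps this from matching the tail of filename="...".
_NAME_RE = re.compile(rb'\bname="([^"]*)"')


def extract_boundary(content_type: str) -> bytes:
    """Pull the multipart boundary out of a Content-Type header value."""
    match = _BOUNDARY_RE.search(content_type)
    if match is None:
        raise ValueError("multipart boundary missing from Content-Type")
    return match.group(1).encode("ascii")


def parse_multipart(body: bytes, boundary: bytes) -> dict[str, bytes]:
    """Map each form field name to its raw payload bytes."""
    fields: dict[str, bytes] = {}
    delimiter = b"--" + boundary
    # Element 0 is the preamble before the first delimiter; skip it.
    for part in body.split(delimiter)[1:]:
        if part.startswith(b"--"):
            break  # closing delimiter "--boundary--"
        # Drop only the CRLF that frames the part, never payload bytes.
        if part.startswith(b"\r\n"):
            part = part[2:]
        if part.endswith(b"\r\n"):
            part = part[:-2]
        headers, sep, content = part.partition(b"\r\n\r\n")
        name = _NAME_RE.search(headers)
        if sep and name:
            fields[name.group(1).decode("utf-8")] = content
    return fields


body = (
    b'--XyZ\r\nContent-Disposition: form-data; name="file"; '
    b'filename="a.docx"\r\nContent-Type: application/octet-stream\r\n\r\n'
    b'PK\x03\x04data\r\n'
    b'--XyZ\r\nContent-Disposition: form-data; name="clean_view"\r\n\r\n'
    b'true\r\n--XyZ--\r\n'
)
fields = parse_multipart(body, extract_boundary("multipart/form-data; boundary=XyZ"))
```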

Fix 1.4 — Failure status tracking: the onFailure handler sets ocrMetadata: { error: "editing_failed", errorMessage, failedAt }. The validation error path also writes failure metadata via step.run("record-validation-failure").

Fix 1.5 — asyncio.to_thread: all 5 routes in adeu.py wrap sync adeu calls in asyncio.to_thread(): read_docx, process_batch (_run_batch), accept_all (_run_accept_all), apply_edits_markdown (_run_apply_markdown), diff_docx (_run_diff).
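
The pattern, reduced to a standalone sketch: a blocking function is handed to `asyncio.to_thread()` so the event loop can keep serving other requests. `extract_text_sync` is a hypothetical stand-in for a blocking adeu call, not a real adeu function.

```python
import asyncio
import time


def extract_text_sync(docx_bytes: bytes) -> str:
    """Hypothetical stand-in for a blocking adeu call."""
    time.sleep(0.05)  # simulates blocking document work
    return f"{len(docx_bytes)} bytes processed"


async def read_route(docx_bytes: bytes) -> str:
    # The blocking call runs in a worker thread; the event loop stays free.
    return await asyncio.to_thread(extract_text_sync, docx_bytes)


async def main() -> list[str]:
    # Two requests overlap instead of running back to back.
    return await asyncio.gather(read_route(b"abc"), read_route(b"defgh"))


results = asyncio.run(main())
print(results)
```

`asyncio.to_thread()` requires Python 3.9 or later.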

Fix 1.6 — Upload size validation: both files define MAX_UPLOAD_SIZE = 50 * 1024 * 1024. adeu.py checks it in _read_upload(); index.py checks content-length in handler() before routing.
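
A minimal sketch of the two checks, assuming a plain file-like request body. The `UploadTooLarge` type and both helper names are hypothetical; the real _read_upload() and handler() may differ. The second helper covers the case where Content-Length is absent or understated.

```python
import io

MAX_UPLOAD_SIZE = 50 * 1024 * 1024  # the value quoted in Fix 1.6


class UploadTooLarge(Exception):
    """Hypothetical error type; a real handler would map it to an HTTP error."""


def check_content_length(header_value, limit: int = MAX_UPLOAD_SIZE) -> int:
    """Validate a Content-Length header before reading the body."""
    try:
        declared = int(header_value)
    except (TypeError, ValueError):
        raise ValueError("missing or malformed Content-Length") from None
    if declared < 0:
        raise ValueError("negative Content-Length")
    if declared > limit:
        raise UploadTooLarge(f"declared {declared} bytes, limit is {limit}")
    return declared


def read_limited(stream, limit: int = MAX_UPLOAD_SIZE) -> bytes:
    """Read the body, rejecting it if more than limit bytes arrive."""
    data = stream.read(limit + 1)  # one extra byte reveals an oversized body
    if len(data) > limit:
        raise UploadTooLarge(f"body exceeds {limit} bytes")
    return data


small = read_limited(io.BytesIO(b"PK\x03\x04"), limit=16)
```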

Fix 1.7 — Generic error responses: all exception handlers in both adeu.py and index.py now return "Internal server error" instead of f"Internal error: {exc}". Details are logged server-side via logger.exception().
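
The shape of that change as a framework-free sketch. The wrapper name, the logger name, and the `(status, body)` return convention are assumptions for illustration; only the "Internal server error" string and the use of logger.exception() come from the fix itself.

```python
import logging

logger = logging.getLogger("adeu.sketch")  # hypothetical logger name


def run_handler(handler, *args):
    """Call handler and map any unexpected exception to a generic 500 body."""
    try:
        return 200, handler(*args)
    except Exception:
        # The full traceback goes to the server log only.
        logger.exception("Unhandled error in ADEU handler")
        return 500, {"detail": "Internal server error"}


def failing_handler():
    # The message contains a filesystem path that must not reach the client.
    raise RuntimeError("/home/dev/secret/path/template.docx not found")


status, body = run_handler(failing_handler)
print(status, body)
```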

Fix 1.8 — Remove secret from build args: x-app-build-args in docker-compose.yml no longer contains OPENAI_API_KEY, and a comment explains why. OPENAI_API_KEY remains in the app service environment.

Fix 1.9 — Sidecar authentication: client.ts has getAuthHeaders() returning { "X-API-Key": process.env.SIDECAR_API_KEY }, applied to all fetch calls. adeu.py has a verify_api_key dependency on all routes via dependencies=[Depends(verify_api_key)]. conftest.py sets the SIDECAR_API_KEY env variable and injects the header.
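
A minimal sketch of the server-side check, without FastAPI. The function name and the fail-closed behaviour when no key is configured are assumptions, and `hmac.compare_digest` is a choice made here for constant-time comparison; this comment does not say how verify_api_key compares keys.

```python
import hmac


def is_authorized(headers: dict, expected_key: str) -> bool:
    """Check the X-API-Key header against the configured sidecar key."""
    if not expected_key:
        # Fail closed when no key is configured (an assumption of this sketch).
        return False
    supplied = headers.get("X-API-Key", "")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))


print(is_authorized({"X-API-Key": "s3cret"}, "s3cret"))
print(is_authorized({}, "s3cret"))
```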

Fix 1.10 — Filename sanitization: both adeu.py and index.py have sanitize_filename() using re.sub(r'[\r\n";/\\]', '_', name). It is applied before all Content-Disposition headers.
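
A runnable sketch of the sanitizer, assuming the character class is meant to cover CR, LF, double quote, semicolon, forward slash, and backslash. The `content_disposition` helper is hypothetical and only shows where the sanitized name ends up.

```python
import re

# CR, LF, double quote, semicolon, forward slash, backslash.
_UNSAFE_CHARS = re.compile(r'[\r\n";/\\]')


def sanitize_filename(name: str) -> str:
    """Replace characters that could break out of a header value."""
    return _UNSAFE_CHARS.sub("_", name)


def content_disposition(name: str) -> str:
    """Hypothetical helper that builds the header from a sanitized name."""
    return f'attachment; filename="{sanitize_filename(name)}"'


print(content_disposition('evil"\r\nX-Injected: 1.docx'))
```

With the CR and LF replaced, the injected text stays inside the quoted filename instead of starting a new header line.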

Fix 1.11 — Non-root Docker user: sidecar/Dockerfile has RUN useradd --create-home --shell /bin/bash sidecar and USER sidecar before CMD.

Fix 1.12 — Remove committed log, update .gitignore: .gitignore now has a *.log entry after the debug log entries.

Fix 1.13 — Safe JSON.parse: client.ts wraps JSON.parse(summaryHeader) in try/catch, with a console.warn and a fallback to the default summary.

Fix 1.14 — Fetch timeout: client.ts has fetchWithTimeout() using AbortController + setTimeout with ADEU_TIMEOUT_MS (default 30s). All 5 adapter functions use it.

Fix 1.15 — Separate DB update step: modifyDocument.ts has DB writes in separate step.run("update-document-record") and step.run("record-validation-failure") blocks, not inside the DOCX processing step.

Fix 1.16 — Conditional ADEU router import: main.py wraps the adeu_router import in try/except at registration time. The health endpoint returns degraded status when adeu is unavailable.
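
The general pattern, as a sketch that does not depend on FastAPI or adeu being installed. The helper names and the response field names ("status", "adeu", "available", "version") are assumptions; the actual health payload in main.py may be shaped differently.

```python
import importlib


def load_optional(module_name: str):
    """Import a module if it is installed; return None when it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def health(adeu_module) -> dict:
    """Build a health payload that degrades instead of failing."""
    if adeu_module is None:
        return {"status": "degraded", "adeu": {"available": False}}
    version = getattr(adeu_module, "__version__", "unknown")
    return {"status": "ok", "adeu": {"available": True, "version": version}}


# A module name that is not installed exercises the degraded path.
missing = load_optional("module_that_is_not_installed_xyz")
print(health(missing))
```

Only ImportError is caught, so an error raised while the module itself is executing still surfaces instead of being reported as "unavailable".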

Fix 1.17 — Narrow import exception handling: index.py uses nested try/except: outer catches ImportError with fallback, inner catches ImportError with logger.warning, and both have except Exception with logger.error + raise.

Fix 1.18 — Single adeu install: sidecar/Dockerfile no longer has an explicit pip install adeu==0.9.0. adeu==0.9.0 is in requirements.txt, and a comment says "adeu is installed via requirements.txt".

Fix 1.19 — Explicit handler assertion: modifyDocument.test.ts now has the line expect(handler).toBeDefined(); with no more if (handler) guard.

Fix 1.20 — Meaningful route registration test: the last test now imports modifyDocument, asserts it's defined, and checks opts?.id === "modify-document". No more expect(true).toBe(true).

Fix 1.21 — Negative test cases: test_adeu_routes.py has TestDiffErrors, TestAcceptAllErrors, and TestApplyEditsMarkdownErrors classes with missing file and invalid DOCX tests.

Fix 1.22 — Dynamic version check: test-sidecar-integration.sh reads EXPECTED_ADEU_VER from requirements.txt via grep -E '^adeu==' instead of hardcoding 0.9.0.
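
The script itself is shell, but the same lookup is easy to express in Python if a pytest-side check is ever wanted. This is a sketch of an equivalent, not code from the PR, and `pinned_adeu_version` is a hypothetical name.

```python
import re

# Same anchor as grep -E '^adeu==': the pin must start at the beginning of a line.
_ADEU_PIN = re.compile(r"^adeu==(\S+)", re.MULTILINE)


def pinned_adeu_version(requirements_text: str):
    """Return the version pinned for adeu, or None if there is no pin."""
    match = _ADEU_PIN.search(requirements_text)
    return match.group(1) if match else None


sample = "fastapi\nadeu==0.9.0\npydantic\n"
print(pinned_adeu_version(sample))
```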

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Agentic PDF interaction

4 participants