
feat(python-recipes): Due Diligence Agent on Deep Agents + Parallel #29

Merged
NormallyGaussian merged 16 commits into main from mh/deepagents-due-diligence on May 7, 2026

Conversation

@NormallyGaussian
Contributor

Summary

Multi-agent due diligence recipe built on LangChain's Deep Agents harness and Parallel's Task API.

One-sentence pitch: a research agent that reasons over its own confidence (Parallel Basis) and chains follow-up queries (previous_interaction_id) when a finding is uncertain.

Due diligence (DD) shows up everywhere in financial services — banks, insurance, PE, credit, compliance, corp dev, VC. The same architecture (subagent decomposition + Basis-aware research + confidence-driven follow-ups + per-competitor fan-out) generalizes to KYB, vendor risk, M&A target evaluation, etc.

What it shows

  • ParallelTaskRunTool + parse_basis — structured per-entity research with per-field citations and calibrated confidence; the wrapper surfaces low_confidence_warning so subagent reasoning can decide to chain a follow-up via previous_interaction_id (see the sketch after this list).
  • Deep Agents canonical fan-out — Phase 2 spawns one competitor-analysis subagent instance per competitor identified by competitive-landscape.
  • Disk-backed FilesystemBackend with virtual_mode=True — every workpaper and the synthesized memo persist to ./reports/workpapers/ on local disk so the artifact is auditable, not ephemeral.
  • ParallelWebSearchTool as orchestrator-level quick lookup for ad-hoc cross-reference verification.
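To make the wrapper pattern concrete, here is a minimal sketch. The two callables are stand-ins (the real recipe wires ParallelTaskRunTool and the SDK's parse_basis; exact signatures live in agent.py):

```python
from typing import Callable, Optional

def research_task(
    run_task: Callable[..., dict],        # stand-in for the Parallel Task API call
    parse_basis: Callable[[dict], dict],  # stand-in for the SDK's Basis helper
    prompt: str,
    previous_interaction_id: Optional[str] = None,
) -> dict:
    """One Task API call with per-field confidence surfaced to the caller."""
    result = run_task(
        prompt,
        processor="pro-fast",  # the validated default after this PR
        previous_interaction_id=previous_interaction_id,  # chains prior source context
    )
    basis = parse_basis(result)  # per-field citations + reasoning + confidence
    low = [field for field, b in basis.items() if b.get("confidence") == "low"]
    return {
        "output": result.get("output"),
        "citations_by_field": basis,
        "low_confidence_warning": low or None,           # non-empty: consider chaining
        "interaction_id": result.get("interaction_id"),  # feed back to follow up
    }
```

A subagent that sees a non-empty low_confidence_warning can issue a second call with previous_interaction_id set, so the follow-up inherits the prior thread's source context instead of restarting the search.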

Validated run

End-to-end on Rivian Automotive at the default core-fast processor:

  • 14:36 wall-clock
  • 10 Task API calls (5 packed Phase-1, 2 chained follow-ups, 3 Phase-2 fan-out)
  • 9 workpaper files persisted (167KB total): corporate-profile.md, financial-health.md, litigation-regulatory.md, news-reputation.md, competitive-landscape.md, competitor-tesla.md, competitor-ford.md, competitor-mercedes.md, plus the 33KB synthesized memo rivian-due-diligence-report.md
  • The synthesized memo includes TOC, exec summary with overall risk rating, full per-section detail with inline source URLs (SEC EDGAR, etc.), and a per-competitor comparative section drawn from the three Phase-2 fan-out workpapers

The Rivian run is committed under reports/workpapers/ so cookbook readers can preview the artifact shape without running the agent themselves.

Files

```
python-recipes/parallel-deepagents-due-diligence/
├── README.md
├── agent.py                   # orchestrator + 6 subagents + tool wrappers
├── due_diligence.ipynb        # 15-cell walkthrough
├── langgraph.json             # for `uv run langgraph dev`
├── pyproject.toml + requirements.txt
├── .env.example + .gitignore
├── sample_output_rivian.md    # orchestrator's final assistant message
└── reports/workpapers/        # full Rivian DD output, 9 files / 167KB
```
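The langgraph.json follows the standard LangGraph CLI schema; the `agent` export name below is an assumption (check agent.py), while the due_diligence graph id matches the test plan:

```json
{
  "dependencies": ["."],
  "graphs": { "due_diligence": "./agent.py:agent" },
  "env": ".env"
}
```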

Registration

  • Listed in top-level README.md under "Deep Research & Notebooks"
  • Listed in website/cookbook.json with slug parallel-deepagents-due-diligence, tags deep-research, task, search, python, deepagents, langchain

Status

Draft; end-to-end validated. No live demo URL yet (the recipe is local-run only — the cookbook reader runs uv run python agent.py after setting their own keys).

Test plan

  • uv venv && uv pip install -e python-recipes/parallel-deepagents-due-diligence/
  • cp .env.example .env and fill in ANTHROPIC_API_KEY + PARALLEL_API_KEY
  • uv run python agent.py — produces sample_output_rivian.md and persists 9 files to reports/workpapers/
  • uv run langgraph dev — agent loads under graph id due_diligence
  • No secrets / credentials committed (verified via repo-wide grep before push)
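Consolidated as one sequence (the cd into the recipe directory is an assumption about where .env and the outputs live):

```bash
uv venv && uv pip install -e python-recipes/parallel-deepagents-due-diligence/
cd python-recipes/parallel-deepagents-due-diligence/
cp .env.example .env        # then fill in ANTHROPIC_API_KEY + PARALLEL_API_KEY
uv run python agent.py      # writes sample_output_rivian.md + reports/workpapers/
uv run langgraph dev        # loads the agent under graph id `due_diligence`
```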

Multi-agent due diligence recipe built on LangChain's Deep Agents harness
and Parallel's Task API. The agent runs in three phases:

- Phase 1 (parallel) — corporate-profile, financial-health,
  litigation-regulatory, news-reputation, competitive-landscape
- Phase 2 (fan-out) — orchestrator dispatches one competitor-analysis
  subagent instance per competitor identified by competitive-landscape
- Phase 3 — orchestrator reads all workpapers, cross-references for
  contradictions, synthesizes the final memo with comparative
  competitor section
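A sketch of the Phase-2 fan-out registration, assuming the deepagents package's create_deep_agent entry point and dict-style subagent specs (prompts abbreviated; agent.py defines all six subagents):

```python
from deepagents import create_deep_agent  # LangChain's Deep Agents harness

# One subagent *type*; the orchestrator spawns one *instance* per competitor
# that the competitive-landscape workpaper names (the canonical fan-out).
competitor_analysis = {
    "name": "competitor-analysis",
    "description": "Deep-dive a single named competitor; writes one workpaper.",
    "prompt": "You research exactly one competitor of the target company ...",
}

agent = create_deep_agent(
    tools=[research_task],            # the Basis-aware Task API wrapper
    subagents=[competitor_analysis],  # plus the five Phase-1 subagents
    instructions="Run due diligence in three phases ...",
)
```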

The recipe demonstrates these patterns:

- ParallelTaskRunTool + parse_basis: structured per-entity research with
  per-field citations and calibrated confidence; the wrapper surfaces
  low_confidence_warning so subagent reasoning can decide to chain a
  follow-up via previous_interaction_id
- Deep Agents canonical fan-out: spawning N instances of the same
  subagent type for N parallel investigations
- Disk-backed FilesystemBackend(virtual_mode=True): every workpaper and
  the synthesized memo persist to ./reports/workpapers/ on local disk
- ParallelWebSearchTool as orchestrator-level quick lookup for ad-hoc
  cross-reference verification when contradictions surface
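The persistence wiring, sketched with an assumed import path (the backend module location and parameter names may differ across deepagents releases — agent.py is the reference):

```python
from deepagents.backends import FilesystemBackend  # assumed import path

# virtual_mode=True keeps subagent file paths virtual while mirroring every
# write into a real directory, so workpapers survive the run on disk.
backend = FilesystemBackend(root_dir="./reports", virtual_mode=True)
agent = create_deep_agent(tools=[research_task], backend=backend)
```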

Validated end-to-end on Rivian Automotive at the default core-fast
processor: 14:36 wall-clock, 10 Task API calls (5 packed, 2 chained, 3 fan-out),
9 workpaper files persisted (167KB total) including a 33KB synthesized
memo with TOC, executive summary, full per-section detail, inline
source URLs, and risk severity tiering.

The recipe ships with a runnable agent.py, a 15-cell walkthrough
notebook, langgraph.json for langgraph dev, pyproject.toml + uv-based
install, and the full Rivian sample output committed under reports/
so cookbook readers can preview the artifact shape.

Registered in top-level README under "Deep Research & Notebooks" and
website/cookbook.json.

First-cut draft of a launch blog post for the recipe. ~1500 words,
engineering-blog tone, opens on the 'agent doesn't know what it doesn't
know' failure mode, walks through the research_task wrapper +
parse_basis + previous_interaction_id pattern, the three-phase
orchestration with per-competitor fan-out, the FilesystemBackend
virtual_mode gotcha, and the Rivian run results (cross-reference
discrepancy resolution, JV-conflict finding, DOE loan correction).

Saved as BLOG_DRAFT.md alongside the recipe so it stays paired with
the code it describes.

Rewrote the blog draft following the parallel.ai cookbook-blog template
(Tags / reading time / GitHub header) and the financial-services
audience framing the team prefers. Opens on the broad set of FS workflows
where DD shows up — bank credit, KYB/EDD, insurance underwriting, PE/VC,
vendor risk, compliance/AML — rather than leading with a critique of
'most research agents.'

Updated all code blocks to match the cookbook's actual implementation:
- ParallelTaskRunTool + parse_basis (the SDK helper that the original
  draft pre-dated; the original drafted custom Basis-walking code)
- core-fast processor (validated default)
- FilesystemBackend(virtual_mode=True) for on-disk workpaper persistence
- competitor-analysis fan-out subagent (the Phase-2 pattern that's the
  canonical Deep Agents move)
- Validated run results (Rivian: 14 min, 10 calls, 33KB cited memo
  with funding-discrepancy resolution and JV-conflict finding)

Expanded 'Who this is for' to lead with the FS verticals.
Each finding in the 'What the agent produced' section now links to the
specific workpaper(s) that produced it. Top-of-post link to the
synthesized memo and the workpapers directory; 'Run it yourself' section
points readers at the synthesized memo and a Tesla competitor sample.

Relative links resolve in the GitHub UI from the blog draft's location
in the repo. They'll need rewriting when porting to the public blog
renderer (separate task).

…eats, LC docs links

Addresses findings from four parallel reviewers (technical accuracy,
end-user clarity, Deep Agents showcase, Parallel showcase) plus two
style alignment passes (parallel.ai/blog and langchain.com/blog).

Accuracy fixes:
- Replaced four fabricated block-quotes in 'What the agent produced'
  with verbatim findings from the actual workpapers: regulatory-credit
  dependency (financial-health), R1T tax-credit advantage (competitor-
  tesla), Mercedes 'technology-open' pivot (competitor-mercedes),
  TechCrunch 'previously unreported' disclosure-adequacy concern
  (litigation-regulatory), and the Confidence-and-Verification-Notes
  section with calibrated confidence ratings + named verification paths
- Fixed FilesystemBackend description: virtual_mode=False doesn't
  silently fail — it writes to the wrong filesystem location
- Fixed streaming snippet to match agent.py's v2 event-API shape (see
  the sketch after this list)
- Fixed cost/latency framing: per-call latencies (15s-100s for
  core-fast, 30s-5min for pro-fast, 5-25min for ultra) per Parallel
  pricing docs, not the inflated per-run numbers we had
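For reference, LangGraph's v2 event API is consumed like this (the input message and event filter are illustrative; agent.py is the source of truth):

```python
import asyncio

async def stream_run(agent, company: str) -> None:
    # v2 events are dicts with `event`, `name`, and `data` keys.
    async for event in agent.astream_events(
        {"messages": [("user", f"Run due diligence on {company}")]},
        version="v2",
    ):
        if event["event"] == "on_chat_model_stream":
            print(event["data"]["chunk"].content, end="", flush=True)

# asyncio.run(stream_run(agent, "Rivian Automotive"))
```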

Capability beats added:
- Basis described as a per-field object with citations + reasoning +
  high/medium/low confidence (the differentiator vs document-level
  relevance scores from generic web search APIs)
- previous_interaction_id explained: chains the prior research thread's
  source context, so 'verify the low-confidence field' doesn't restart
- Phase-2 fan-out WHY: each subagent burns ~10-20K tokens of raw
  research material that doesn't pollute the orchestrator window
- ParallelWebSearchTool's role sold: 1-3s, ~$0.005/call, ideal for
  cheap fact-check disambiguation during synthesis
- Extensions section: FindAll for entity discovery, Monitor for
  post-deal surveillance, ParallelEnrichment for batch DD,
  Deep Agents primitives we don't exercise (interrupt_on, checkpointer,
  skills/memory)

LangChain docs links threaded throughout per user request:
- Deep Agents overview, planning (write_todos), subagents, filesystem,
  FilesystemBackend, harness primitives — each linked at first mention

Style alignments:
- 'Deep Agents is the harness, Parallel is the research substrate'
  framing (matches LangChain's 'Agent = Model + Harness' mental model)
- Compressed Cost/Latency table to inline note (Parallel cookbook style)
- Trimmed 'Who this is for' enumeration (replaced with extensions
  section that does similar work for engineers)
- Resources block tightened to 4 grouped links from 7 flat ones

Holding back for a future pass: hero image/diagram, Key Takeaways box
at top, 'Why this architecture' restructure, code-block trimming.

…ch default to pro-fast

Blog draft (BLOG_DRAFT.md):
- Lead intro paragraph differentiates this recipe from canonical
  examples/deep_research (Tavily-based, generic prose) as the
  citation-grade vertical companion
- Sharpened Basis differentiation: 'rather than a single document-
  level relevance score'
- Dropped 'middleware' from the four-primitive list (overclaim — never
  exercised in the recipe); now lists three named primitives
- Added 'Who this is for' callout under metadata block
- Added plain-English intro paragraph before the 45-line research_task
  code block (audience C bounces here per clarity reviewer)
- Tightened 'a couple of chained follow-ups' → explicit 'two' for
  auditable arithmetic (5 packed + 2 chained + 3 fan-out = 10)
- competitive-landscape Phase-1 bullet now explicitly says it returns
  three competitor names that Phase-2 fans out on
- Reinstated cost/latency table (small, 3 rows) for senior-eng spend
  evaluation; per-call latencies match Parallel pricing docs
- Added Sonnet 4.6 model-selection rationale + how to swap
- Added inline platform.parallel.ai link in Run-it section's .env
  comment (was previously only in Resources)

Default processor switched core-fast → pro-fast across:
- agent.py: research_task wrapper
- README.md: prose, code snippet, cost table
- due_diligence.ipynb: research_task code, latency annotations
- BLOG_DRAFT.md: code snippet, validation prose, cost table

Validation metrics for the pro-fast run will refresh once that run
completes. The 'core-fast: 14 min, 10 calls' baseline is removed
from the blog intro to avoid mismatch; Cost section keeps the
generic call-count breakdown.

… and findings

Re-validated end-to-end on Rivian at the new pro-fast default:
- 23 minutes wall-clock (vs ~14 min on core-fast — pro-fast trades
  speed for deeper per-call reasoning)
- 9 Task API calls (5 packed Phase-1 + 1 chained litigation-regulatory
  follow-up + 3 Phase-2 competitor-analysis)
- 37KB synthesized memo (vs 33KB on core-fast)
- 9 workpapers / ~189KB total persisted on disk

Notable quality differences vs the prior core-fast run that the
blog now leads with:

(1) Quality-of-earnings finding: orchestrator synthesis caught that
Rivian's first-ever FY2025 gross profit ($144M) was entirely funded
by VW JV software/services revenue — automotive segment lost ~$432M
at the gross level. core-fast had the regulatory-credit-dependency
angle; pro-fast goes further to the JV-software-vs-auto-margin reality.

(2) Open OSHA fatality investigation (Kevin Lancaster, March 5 2026,
Normal IL) escalated as part of a documented pattern of prior OSHA
serious citations — pro-fast's litigation-regulatory subagent caught
this and chained a targeted OSHA-IMIS follow-up; core-fast didn't
surface it.

(3) Sharper competitor lineup — pro-fast picked Tesla, Ford, Kia
(vs core-fast's Tesla, Ford, Mercedes). Kia EV9's 22,017 US sales
in 2024 + North American Utility Vehicle of the Year award make it
a more direct R1S three-row competitor than Mercedes (which is
exiting the segment per its 'technology-open' pivot).

Refreshed in the blog:
- Validation metrics in the intro (9 calls, 23 min, 37KB memo)
- Cost-and-latency call breakdown
- All four 'What the agent produced' findings — replaced with new
  pro-fast-specific quotes drawn from the actual workpapers
- Competitor list (Tesla, Ford, Kia)

Workpapers committed in reports/workpapers/ for cookbook readers
to preview without running the agent themselves.

Three reviewer-flagged items addressed:
- Two stale '33 KB' memo references (header sub-line + FilesystemBackend
  paragraph) updated to '37 KB' to match the pro-fast Rivian memo
- Resources block: Basis link corrected from /task-api/guides/basis to
  the canonical /task-api/guides/access-research-basis
- Added a literal Basis example after the research_task code block —
  shows what citations_by_field, low_confidence_warning, and
  interaction_id look like when a real call comes back uncertain. The
  Parallel reviewer flagged that Basis was described but not shown;
  this converts it from claim to artifact in ~10 lines and demonstrates
  the chained-follow-up trigger concretely.
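Shape-wise (all values below are invented for illustration; the field names follow the research_task wrapper), an uncertain result looks like:

```python
# Illustration only — invented values, real field names from the wrapper.
{
    "citations_by_field": {
        "total_funding": {
            "confidence": "low",
            "reasoning": "Sources disagree on whether the DOE loan counts.",
            "citations": ["https://example.com/filing", "https://example.com/news"],
        },
    },
    "low_confidence_warning": ["total_funding"],
    "interaction_id": "tir_abc123",  # pass as previous_interaction_id to chain
}
```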

All four re-reviewers (technical accuracy, end-user clarity, Deep Agents
showcase, Parallel showcase) report the prior round of issues addressed.
Remaining minor items deferred to the user's editorial pass:
- Title still flat ('Building a company due diligence agent...')
- Audience B forking guidance ('to repurpose for X, change these three
  things') — could be tightened
- Production-hardening note for senior-eng audience (failure modes,
  retries, idempotency) — could be added under Extensions

…allout, attribute Deep Agents to LangChain

- Metadata block (Tags / Reading time / GitHub / Sample output) now
  renders as bullets rather than collapsing into one paragraph in
  GitHub markdown view
- Removed byte-size noise (37 KB / 189 KB) from validation paragraph
  and metadata — readers don't need that level of detail
- Dropped the 'Who this is for' verticals enumeration callout. The
  opening paragraph already names the FS audience (PE / bank credit /
  compliance / insurance / vendor risk); the callout was off-voice
  per the Parallel cookbook style review and read as sales-deck rather
  than engineering-cookbook
- 'Deep Agents is the harness' → 'LangChain's Deep Agents is the
  harness' for clearer attribution and to match LangChain's preferred
  framing

…ibling, not contrast

The prior phrasing ('generic web search', 'open-ended prose report')
read as an implicit knock against LangChain's canonical deep_research
example. Reframed as 'sits alongside as a citation-grade vertical
companion: same harness, swap the research substrate for Parallel's
Task API' — same positioning, no disparagement of the sibling example.

Restructured against the parallel.ai cookbook house style — terse,
section spine of setup → tools → subagents → orchestrator → run →
stream → who this is for → resources. Trimmed:

- 'Substrate / harness' framing replaced with plain 'combining Deep
  Agents for orchestration and Parallel's Task API for web research'
- Phase 1/2/3 ceremony collapsed into one paragraph in Overview
- 'What the agent produced' results section removed (Sample Output
  link at top handles that need; readers can click through to the memo)
- Cost/latency tier table removed (one-line pointer to pricing page)
- Extensions section (FindAll/Monitor/Enrichment/ultra) removed
- Sonnet 4.6 selection rationale paragraph removed
- Literal Basis snippet (>>> result = ...) removed
- FilesystemBackend(virtual_mode=True) compressed from a 4-paragraph
  warning to one sentence in the orchestrator section
- Multi-paragraph virtual_mode/StateBackend explanation gone

Preserved:
- ParallelTaskRunTool + parse_basis (the SDK helpers; user's earlier
  draft pre-dated them)
- FilesystemBackend(virtual_mode=True) one-line mention so readers know
  workpapers persist to disk
- Phase-2 competitor-analysis fan-out subagent (the canonical Deep
  Agents pattern; single paragraph in Overview, one code block in
  Implementation)
- pro-fast as default
- Validated 9 calls / ~23 min on Rivian
- Streaming snippet matching agent.py's v2 event API

Result: ~2400 words → ~1500 words. Reads as a cookbook, not a tutorial.

…older

Keeps the recipe directory focused on user-facing files. The
unpublished BLOG_DRAFT.md and Observability-with-LangSmith.md
drafts now live under drafts/ for reference.

Adds a callout under the lead and a Resources entry pointing to
the LangChain blog post that announces this recipe.

The PNGs are only referenced from the draft writeups; keeping
them under drafts/assets/ preserves the existing relative paths
in BLOG_DRAFT.md and Observability-with-LangSmith.md.
NormallyGaussian marked this pull request as ready for review May 7, 2026 03:42
NormallyGaussian merged commit 2021188 into main May 7, 2026
1 check passed