Improve Speaker Identification docs: clearer overview, decision guide, and page split #778

Merged
dylan-duan-aai merged 12 commits into main from devin/1773776058-speaker-id-docs-improvements
Mar 18, 2026

@devin-ai-integration devin-ai-integration bot commented Mar 17, 2026

Summary

Documentation-only changes to the Speaker Identification page (fern/pages/speech-understanding/speech-understanding.mdx) to improve clarity and scannability, plus a structural refactor to reduce code duplication by splitting Method 1 and Method 2 into separate pages.

Clarity improvements (commit 1)

  1. Overview rewrite: Leads with the value proposition ("Replace generic labels with real names or roles, no voice enrollment needed") instead of a feature description.
  2. Prerequisite callout upgrade: Changed the Speaker Diarization dependency from a <Note> to a <Warning> with an explicit speaker_labels: true instruction, making it harder to miss.
  3. known_values vs speakers surfaced earlier: Added a new "Choosing how to identify speakers" subsection in the Overview that explains the two approaches upfront, rather than burying the distinction in the Advanced Usage section.
  4. Decision guide: Added a quick "name vs role" and "when to use speakers with descriptions" guide so users can pick the right approach at a glance.
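
To make the prerequisite and the two identification inputs concrete, here is a minimal sketch of a request body. The keys `speaker_labels`, `speaker_type`, and `known_values`, and the nesting path `speech_understanding.request.speaker_identification`, are taken from this description. The helper name `build_identification_request`, the `audio_url` key, the restriction of `speaker_type` to two values, and the sample values are assumptions for illustration only.

```python
from typing import Any


def build_identification_request(
    audio_url: str, speaker_type: str, known_values: list[str]
) -> dict[str, Any]:
    """Build a request body with Speaker Identification enabled (hypothetical helper)."""
    # Assumption: only "name" and "role" are valid, based on the name/role guide above.
    if speaker_type not in ("name", "role"):
        raise ValueError("speaker_type must be 'name' or 'role'")
    return {
        "audio_url": audio_url,  # assumed key name for the audio source
        "speaker_labels": True,  # hard prerequisite: Speaker Diarization must be on
        "speech_understanding": {
            "request": {
                "speaker_identification": {
                    "speaker_type": speaker_type,
                    "known_values": known_values,
                }
            }
        },
    }


body = build_identification_request(
    "https://example.com/interview.mp3", "role", ["Interviewer", "Interviewee"]
)
```

Keeping `speaker_labels` hard-coded in the helper mirrors the intent of the `<Warning>` callout: the prerequisite cannot be forgotten by the caller.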

Page split (commit 2)

  1. Main page now shows only Method 1 (transcribe + identify in one request). All Method 2 code examples, curl snippets, and dual-method Advanced Usage snippets have been removed from the main page.
  2. New page created: speaker-identification-existing-transcript.mdx — "Using Speaker Identification on an existing transcript" — contains the Method 2 full examples (Python + JS), its own Advanced Usage section, and the Method 2 API reference (curl).
  3. Cross-links added: The main page links to the new page via a <Note> callout in the "How to use" section. The new page links back to the main page for the shared request parameters and response reference tables.
  4. docs.yml updated: Speaker Identification changed from a page to a section with the new page as a child, registered at slug speaker-identification-existing-transcript.

Content refinements (commits 3–5)

  1. "Output format details" section removed from the main page — it was redundant with the before/after already shown in the Overview.
  2. "Identify by name" and "Identify by role" subsections added under "How to use" on both pages — each has a brief description and full Python + JS tabbed examples. Previously the code examples had no name/role subsection headings, and role-based identification was only covered in Advanced Usage.
  3. "Advanced usage" renamed to "Adding speaker metadata" on both pages — promoted to a ## heading, with the former ### sub-heading ("Adding speaker metadata with speakers") removed. The #### Simple usage and #### Advanced usage sub-headings within are also flattened into flowing prose, making the section more scannable.
  4. Common role combinations list added after the "Identify by role" examples on both pages.
  5. "Choosing how to identify speakers" simplified — removed the known_values vs speakers intro paragraph and bullet points. The section now leads directly with the name/role decision bullets, each with a "Click here to learn more." anchor link to the corresponding #identify-by-name / #identify-by-role / #adding-speaker-metadata section.
  6. API reference callout removed — the <Note> linking to the existing-transcript page from the API reference section was removed (the cross-link in the "How to use" section remains).
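
As a rough illustration of what the "Adding speaker metadata" section covers, the sketch below builds entries for the `speakers` list. It assumes each speaker object carries either a `name` or a `role` key plus free-form custom properties (the request parameters table refers to these as `speakers[].<custom>`). The helper `speaker_entry` and the `description` property name are hypothetical.

```python
def speaker_entry(speaker_type: str, value: str, **custom: str) -> dict[str, str]:
    """Return one speaker object keyed by "name" or "role" (hypothetical helper)."""
    if speaker_type not in ("name", "role"):
        raise ValueError("speaker_type must be 'name' or 'role'")
    # Custom properties are passed through unchanged alongside the identifying key.
    return {speaker_type: value, **custom}


by_name = [
    speaker_entry("name", "Alice", description="Hosts the show"),
    speaker_entry("name", "Bob", description="Guest, asks few questions"),
]

# The same custom properties work for roles: only the identifying key changes.
by_role = [
    speaker_entry("role", "Interviewer", description="Asks most of the questions"),
]
```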

Polish pass (commits 6–7)

  1. Warning/Note split on main page: The <Warning> box previously combined the hard prerequisite (speaker_labels: true required) with an audio quality tip. These are now separated — <Warning> for the requirement, <Note> for the quality guidance — so the critical prerequisite stands out.
  2. "Python for brevity" note added to the "Adding speaker metadata" section on both pages, since those examples are Python-only while the rest of the page has Python + JS tabs.
  3. Role-based custom properties code block replaced with a sentence on both pages: "You can use the same custom properties with role-based identification by replacing name with role in each speaker object." — eliminates a low-value code snippet that only demonstrated a narrow point.
  4. "How to use" intro simplified on main page: "to transcribe and identify speakers in a single step" → "to identify speakers" — the "single step" comparison was a leftover from when both methods lived on the same page.
  5. Response JSON truncated in the API reference on the main page: reduced from two full utterances (~56 lines) to one utterance with truncated text and a // ... more utterances comment. The response fields table still documents every field.
  6. "Key differences from standard transcription" table replaced with a sentence: "With Speaker Identification, the speaker field in utterances and words contains the identified name or role instead of generic labels like "A", "B", "C"." The two-row table was redundant given how little information it conveyed.
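
The replacement sentence in item 6 can be turned into a quick sanity check on a response. The sketch below assumes a response shape in which `utterances` is a list of objects, each with a `speaker` field and a nested `words` list whose items also carry `speaker`. It also assumes that generic labels are single uppercase letters. Both are assumptions, and `generic_labels` is a hypothetical helper.

```python
import re

# Assumption: generic diarization labels look like "A", "B", "C".
GENERIC_LABEL = re.compile(r"^[A-Z]$")


def generic_labels(transcript: dict) -> set[str]:
    """Collect speaker values that still look like generic labels."""
    found: set[str] = set()
    for utterance in transcript.get("utterances", []):
        speakers = [utterance.get("speaker")]
        speakers += [word.get("speaker") for word in utterance.get("words", [])]
        for speaker in speakers:
            if isinstance(speaker, str) and GENERIC_LABEL.match(speaker):
                found.add(speaker)
    return found


sample = {
    "utterances": [
        {"speaker": "Alice", "words": [{"speaker": "Alice"}]},
        {"speaker": "B", "words": [{"speaker": "B"}]},
    ]
}
unresolved = generic_labels(sample)
```

An empty result suggests every speaker was identified; any remaining letters point to speakers the request did not cover.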

Final polish (commit 8)

  1. Request parameters table shortened on main page: removed the three container-object rows (speech_understanding, .request, .speaker_identification) and dropped the speaker_identification. prefix from the remaining keys. An introductory sentence ("The following parameters are nested under speech_understanding.request.speaker_identification:") replaces them. This gives the Description column substantially more horizontal space.
  2. Redundant speakers vs known_values <Note> removed from both pages — this distinction was already covered in the "Choosing" decision guide, the "Adding speaker metadata" intro paragraph, and the request parameters table descriptions.
  3. "Choosing how to identify speakers" section added to the sub page — previously only on the main page. Users landing directly on the sub page now get the same name/role/metadata decision guide with anchor links.
  4. Sub page Overview tightened: "This is useful for more complex workflows…" → "This is especially useful when you want to re-identify speakers with different parameters, or when your workflow separates transcription from post-processing." Leads with the stronger use case.
  5. Role-based Before/After example added to the Overview on the main page: a second "After (by role)" block showing Speaker A → Interviewer, reinforcing that roles are a first-class option.
  6. "default approach" → "most common approach" in "Identify by name" description on both pages — avoids implying speaker_type: "name" is a default parameter value (the parameter is required).
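
For readers mapping the shortened table keys from item 1 back to the request, the following sketch expands a short key into its full dotted path and wraps a flat parameter dict in the containers that were dropped from the table. The nesting path is quoted from this description; the helper names `full_key` and `nest` are hypothetical.

```python
from typing import Any

NESTING_PATH = "speech_understanding.request.speaker_identification"


def full_key(short_key: str) -> str:
    """Expand a short table key into its full dotted path."""
    return f"{NESTING_PATH}.{short_key}"


def nest(params: dict[str, Any]) -> dict[str, Any]:
    """Wrap flat parameters in the three container objects, innermost first."""
    nested: dict[str, Any] = dict(params)
    for part in reversed(NESTING_PATH.split(".")):
        nested = {part: nested}
    return nested


payload_fragment = nest({"speaker_type": "name", "known_values": ["Alice", "Bob"]})
```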

Review & Testing Checklist for Human

  • Verify anchor links on both pages: The "Click here to learn more." links point to #identify-by-name, #identify-by-role, and #adding-speaker-metadata. These anchors now exist on both the main page and the sub page. Confirm Fern generates matching slugs from the headings on each page — if Fern slugifies differently, these links will be broken.
  • Verify the shortened Request parameters table renders correctly: The table now uses short keys like speaker_type, known_values, speakers[].<custom>. Confirm the table renders with better proportions (wider Description column) and that the introductory sentence about the nesting path is visible.
  • Verify the section change in docs.yml: Speaker Identification changed from page to section. Confirm sidebar navigation, URL routing (/docs/speech-understanding/speaker-identification), and TOC still work as expected.
  • Check cross-page links resolve correctly: Main page → /docs/speech-understanding/speaker-identification-existing-transcript; sub page → /docs/speech-understanding/speaker-identification#request-parameters and #response.
  • Verify the role-based Before/After renders cleanly: The Overview now has three code blocks in sequence (Before, After by name, After by role). Confirm they don't visually blend together or look cluttered.
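
The anchor check in the first bullet could be partly automated before opening the preview. The sketch below uses a GitHub-style slug rule (lowercase, punctuation dropped, spaces to hyphens). Whether Fern slugifies headings the same way is exactly what the checklist asks the reviewer to confirm, so treat `slugify` as an assumption rather than Fern's actual behavior. `missing_anchors` is a hypothetical helper.

```python
import re


def slugify(heading: str) -> str:
    """Assumed slug rule: lowercase, strip punctuation, collapse spaces to hyphens."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^a-z0-9\s-]", "", slug)
    return re.sub(r"[\s-]+", "-", slug).strip("-")


def missing_anchors(markdown: str, anchors: list[str]) -> list[str]:
    """Return the anchors that no heading in the page would generate."""
    headings = re.findall(r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    available = {slugify(heading) for heading in headings}
    return [anchor for anchor in anchors if anchor.lstrip("#") not in available]


page = "## Identify by name\n\n## Identify by role\n\n## Adding speaker metadata\n"
expected = ["#identify-by-name", "#identify-by-role", "#adding-speaker-metadata"]
broken = missing_anchors(page, expected)
```

Running this against both `.mdx` files only narrows the search; the deploy preview remains the source of truth for the rendered anchors.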

Recommended test plan:

  1. Open the deploy preview and navigate to Speaker Identification.
  2. Verify the Before/After shows both name and role examples.
  3. Verify <Warning> and <Note> render as separate boxes.
  4. Click all three "Click here to learn more." anchor links.
  5. Scroll to the Request parameters table and confirm the shorter keys render with a wider Description column.
  6. Scroll to the Response JSON and confirm the truncated example renders cleanly.
  7. Click the "existing transcript" link.
  8. On the sub page, verify the "Choosing how to identify speakers" section exists with working anchor links.
  9. Verify the "most common approach" wording in "Identify by name".
  10. Verify the speakers vs known_values <Note> is gone from both pages.

Notes

  • Both pages have hidden: true in frontmatter, consistent with the existing pattern.
  • The new page's API reference defers to the main page for the full request parameters table and response fields table to avoid duplication.
  • The old #advanced-usage anchor is replaced by #adding-speaker-metadata — this is a breaking change for any existing external links targeting that anchor.
  • Pre-existing lint warnings are unrelated (unused OpenAPI components).
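
Because the `#advanced-usage` anchor is gone, internal links that still target it need updating. A minimal sketch of such a rewrite follows. The anchor names come from the note above, while the helper `rewrite_anchor` and the sample link text are hypothetical. The lookahead keeps longer anchors that merely start with the old name untouched.

```python
import re


def rewrite_anchor(text: str, old: str, new: str) -> str:
    """Replace links to `old` with `new`, leaving longer anchors alone."""
    # (?![\w-]) ensures "#advanced-usage-notes" is not treated as "#advanced-usage".
    pattern = re.compile(re.escape(old) + r"(?![\w-])")
    return pattern.sub(new, text)


source = "See [metadata](/docs/page#advanced-usage) and #advanced-usage-notes."
updated = rewrite_anchor(source, "#advanced-usage", "#adding-speaker-metadata")
```

This only helps with links inside the repository; external links to the old anchor cannot be fixed this way.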

Link to Devin session: https://app.devin.ai/sessions/73e913af2ee5457797441017325f14d7
Requested by: @LeeVaughn



…allout, decision guide

- Rewrite overview to lead with the value prop (replace generic labels with real names/roles, no voice enrollment needed)
- Upgrade Speaker Diarization prerequisite from Note to Warning with explicit speaker_labels: true instruction
- Add 'Choosing how to identify speakers' section surfacing known_values vs speakers choice earlier
- Add decision guide for name vs role identification and when to use speakers with descriptions

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>

… existing transcript page (Method 2)

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>
@devin-ai-integration devin-ai-integration bot changed the title from "Improve Speaker Identification docs: clearer overview, prerequisite callout, decision guide" to "Improve Speaker Identification docs: clearer overview, decision guide, and page split" on Mar 17, 2026

…ls, slim Advanced Usage

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>

…aker metadata'

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>

…lout

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>

…lit Warning/Note, simplify intro

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>

…o sub page, remove redundant Note

Co-Authored-By: Lee Vaughn <dlvprogramming@gmail.com>

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.


@dylan-duan-aai dylan-duan-aai self-requested a review March 18, 2026 15:11

@dylan-duan-aai dylan-duan-aai merged commit 94ae60d into main Mar 18, 2026
4 of 5 checks passed
@dylan-duan-aai dylan-duan-aai deleted the devin/1773776058-speaker-id-docs-improvements branch March 18, 2026 15:24