Allow multiple transcript rows for the same Asterisk uniqueid by Stell0 · Pull Request #47 · nethesis/satellite

Stell0 · 2026-04-30T07:06:19Z

Summary

This change removes the UNIQUE constraint from transcripts.uniqueid and updates the persistence flow to track each stored transcript by its internal id.

Why

With recent ns8-nethvoice changes for transferred calls (nethesis/ns8-nethvoice#803), a single Asterisk call can produce multiple recording fragments that share the same uniqueid but belong to different call legs. The previous schema and write path assumed one row per uniqueid, so Satellite would either overwrite earlier fragments through ON CONFLICT (uniqueid) or record state changes against the wrong fragment.

What changed

transcripts.uniqueid is now non-unique and remains indexed.
Startup schema bootstrap removes the legacy unique constraint from existing databases.
POST /api/get_transcription creates one transcript row per persisted request.
Raw transcript and state transitions are now updated by transcript_id, not by uniqueid.
Tests and README were updated.
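
As a rough illustration of the write path described above, here is a minimal sketch, not the code in db.py. It uses SQLite from the Python standard library as a stand-in for the project's database so that it runs on its own. The helper names (`bootstrap`, `insert_transcript`, `store_raw_transcript`), the index name, the column set, and the state values `progress` and `done` are all assumptions made for the example.

```python
import sqlite3


def bootstrap(conn: sqlite3.Connection) -> None:
    """Create a transcripts table whose uniqueid is indexed but NOT unique."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS transcripts (
            id INTEGER PRIMARY KEY,
            uniqueid TEXT NOT NULL,
            state TEXT NOT NULL DEFAULT 'progress',  -- assumed state name
            raw_transcription TEXT
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_transcripts_uniqueid "
        "ON transcripts (uniqueid)"
    )


def insert_transcript(conn: sqlite3.Connection, uniqueid: str) -> int:
    """Insert one row per persisted request and return its internal id."""
    cur = conn.execute(
        "INSERT INTO transcripts (uniqueid) VALUES (?)", (uniqueid,)
    )
    return cur.lastrowid


def store_raw_transcript(
    conn: sqlite3.Connection, transcript_id: int, text: str, state: str = "done"
) -> None:
    """Update exactly one row, addressed by id rather than by uniqueid."""
    conn.execute(
        "UPDATE transcripts SET raw_transcription = ?, state = ? WHERE id = ?",
        (text, state, transcript_id),
    )


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    bootstrap(conn)
    # Two fragments of the same call share a uniqueid but get distinct ids.
    first = insert_transcript(conn, "1714460779.42")
    second = insert_transcript(conn, "1714460779.42")
    # Only the second fragment is updated; the first keeps its initial state.
    store_raw_transcript(conn, second, "second leg")
    return conn.execute(
        "SELECT id, state, raw_transcription FROM transcripts ORDER BY id"
    ).fetchall()
```

Calling `demo()` leaves the first fragment untouched while the second one carries the transcript, which is the behavior an update keyed on uniqueid could not guarantee.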

Impact

This preserves all transferred-call fragments, prevents transcript/state corruption when multiple uploads share the same uniqueid, and keeps AI enrichment tied to the correct stored fragment. No HTTP or MQTT payloads changed.

Stell0 · 2026-04-30T07:07:19Z

nethcti-middleware Regressions And Fixes
Reviewed on branch fix_transcriptions at commit e4b4631.

Deterministic read regression: transcription and summary helpers still use an unordered LIMIT 1, so when duplicate transcript rows exist they can return an arbitrary fragment. Fix by selecting one canonical row per uniqueid, preferably the latest non-deleted row ordered by updated_at DESC, id DESC, unless the product explicitly calls for aggregation.
Duplicate list regression: summary/status list logic can return multiple entries for the same uniqueid because it reads every matching transcript row. Fix with a canonical-row CTE/subquery or DISTINCT ON (uniqueid).
Update fan-out regression: manual summary updates currently affect every row with the same uniqueid. Fix by updating only the canonical row.
Delete fan-out regression: summary deletion currently marks every fragment deleted. Fix by deleting only the canonical row.
Watch/HEAD regression: summary state checks and watcher logic can observe the wrong row and misreport done, failed, or missing summary/transcription. Fix by applying the same canonical-row rule used elsewhere.
Caller/callee metadata regression: transferred-call CDR metadata still uses single-row selection and can display the wrong leg. Fix with a deterministic metadata selection rule, or merge data across matching CDR rows if that is the intended UX.
Missing coverage: add tests with at least two transcript rows sharing the same uniqueid and verify deterministic reads, deduplicated list output, and non-fan-out update/delete behavior.
No change needed for authorization: participation checks already iterate all matching CDR rows, so they are not relying on a single-row assumption.
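
The canonical-row rule proposed in the list above (latest non-deleted row, ordered by updated_at DESC, id DESC) can be sketched as follows. This is an illustration in Python with SQLite, not nethcti-middleware code. The `summary` and `deleted_at` columns, the helper names, and the sample data are assumptions. A correlated subquery is used instead of DISTINCT ON so the same SQL shape runs on SQLite.

```python
import sqlite3

# Picks the id of the canonical row for one uniqueid.
CANONICAL_ID_SQL = """
    SELECT t2.id FROM transcripts AS t2
    WHERE t2.uniqueid = ? AND t2.deleted_at IS NULL
    ORDER BY t2.updated_at DESC, t2.id DESC
    LIMIT 1
"""


def canonical_id(conn, uniqueid):
    row = conn.execute(CANONICAL_ID_SQL, (uniqueid,)).fetchone()
    return row[0] if row else None


def list_canonical(conn):
    """Return one (id, uniqueid, summary) row per uniqueid."""
    return conn.execute(
        """
        SELECT t.id, t.uniqueid, t.summary FROM transcripts AS t
        WHERE t.id = (
            SELECT t2.id FROM transcripts AS t2
            WHERE t2.uniqueid = t.uniqueid AND t2.deleted_at IS NULL
            ORDER BY t2.updated_at DESC, t2.id DESC
            LIMIT 1
        )
        ORDER BY t.uniqueid
        """
    ).fetchall()


def update_summary(conn, uniqueid, summary):
    """Update only the canonical row, so siblings are not overwritten."""
    target = canonical_id(conn, uniqueid)
    if target is None:
        return False
    conn.execute("UPDATE transcripts SET summary = ? WHERE id = ?", (summary, target))
    return True


def soft_delete(conn, uniqueid, deleted_at):
    """Mark only the canonical row as deleted."""
    target = canonical_id(conn, uniqueid)
    if target is None:
        return False
    conn.execute(
        "UPDATE transcripts SET deleted_at = ? WHERE id = ?", (deleted_at, target)
    )
    return True


def demo_conn():
    """In-memory table with two fragments for call-1 and one for call-2."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transcripts (id INTEGER PRIMARY KEY, uniqueid TEXT NOT NULL, "
        "summary TEXT, updated_at TEXT NOT NULL, deleted_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO transcripts (uniqueid, summary, updated_at) VALUES (?, ?, ?)",
        [
            ("call-1", "first leg", "2026-04-30T07:00:00Z"),
            ("call-1", "second leg", "2026-04-30T07:05:00Z"),
            ("call-2", "only leg", "2026-04-30T07:01:00Z"),
        ],
    )
    return conn
```

One consequence of this rule worth deciding on deliberately: after the canonical row is soft-deleted, the next most recent fragment becomes canonical, so a deleted summary can appear to be replaced by an older one.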

Copilot

Pull request overview

This PR updates the transcription persistence flow so Satellite can store multiple transcript fragments that share the same Asterisk uniqueid, which is needed for transferred/multi-leg calls after the related ns8-nethvoice changes.

Changes:

Removes the database uniqueness assumption on transcripts.uniqueid and migrates existing schemas to a non-unique indexed column.
Switches persistence/state tracking to use the transcript row’s internal id, while also storing optional linkedid, src_number, and dst_number.
Expands tests and README coverage for the new persistence behavior, including silent-audio handling.
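
To make the migration step concrete, the following is a sketch of an idempotent "drop the legacy uniqueness" bootstrap. It is written against SQLite purely so it can run standalone; it is not the migration in db.py. SQLite cannot drop a column-level UNIQUE constraint in place, so the sketch rebuilds the table, whereas on PostgreSQL the same step is usually a single `ALTER TABLE ... DROP CONSTRAINT`. The function names, the index name, and the reduced column set are assumptions.

```python
import sqlite3


def has_unique_index(conn, table, column):
    """Return True if a unique index covers exactly `column` on `table`."""
    for _seq, name, unique, *_rest in conn.execute(
        f"PRAGMA index_list('{table}')"
    ).fetchall():
        if not unique:
            continue
        cols = [r[2] for r in conn.execute(f"PRAGMA index_info('{name}')").fetchall()]
        if cols == [column]:
            return True
    return False


def drop_uniqueid_uniqueness(conn):
    """Rebuild transcripts without UNIQUE on uniqueid; no-op if already done."""
    if not has_unique_index(conn, "transcripts", "uniqueid"):
        return False
    conn.executescript(
        """
        BEGIN;
        CREATE TABLE transcripts_new (
            id INTEGER PRIMARY KEY,
            uniqueid TEXT NOT NULL,
            state TEXT NOT NULL
        );
        INSERT INTO transcripts_new (id, uniqueid, state)
            SELECT id, uniqueid, state FROM transcripts;
        DROP TABLE transcripts;
        ALTER TABLE transcripts_new RENAME TO transcripts;
        -- Keep lookups by uniqueid fast with a plain, non-unique index.
        CREATE INDEX idx_transcripts_uniqueid ON transcripts (uniqueid);
        COMMIT;
        """
    )
    return True
```

Because the function checks for the unique index before doing anything, it can run on every startup: the first run migrates a legacy database and later runs return `False` without touching the table.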

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| `db.py` | Updates schema bootstrap/migration and changes transcript persistence/state helpers to work with row IDs instead of assuming `uniqueid` uniqueness. |
| `api.py` | Passes new participant fields through the transcription endpoint, initializes a transcript row up front, and adds empty-audio handling. |
| `tests/test_db.py` | Adds coverage for schema migration, new participant columns, insert/update-by-id behavior, and latest-row state updates. |
| `tests/test_api.py` | Adds endpoint tests for persisted transcript IDs, linked/participant field forwarding, and silent-audio success handling. |
| `README.md` | Documents the non-unique `uniqueid` model, new optional persisted fields, and the silent-audio behavior. |
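
The api.py behavior summarized above (create the row up front, forward the optional fields, treat silent audio as a success) might look roughly like the sketch below. It is a framework-free illustration, not the endpoint's actual code: `InMemoryStore`, `handle_transcription`, the `transcribe` callable, the state names, and the choice to store an empty string for silent audio are all assumptions.

```python
from typing import Callable, Dict, Optional


class InMemoryStore:
    """Hypothetical stand-in for the persistence helpers in db.py."""

    def __init__(self) -> None:
        self.rows: Dict[int, dict] = {}
        self._next_id = 1

    def insert(self, uniqueid, linkedid=None, src_number=None, dst_number=None) -> int:
        transcript_id = self._next_id
        self._next_id += 1
        self.rows[transcript_id] = {
            "uniqueid": uniqueid,
            "linkedid": linkedid,
            "src_number": src_number,
            "dst_number": dst_number,
            "state": "progress",
            "raw_transcription": None,
        }
        return transcript_id

    def finish(self, transcript_id: int, state: str, text: Optional[str] = None) -> None:
        row = self.rows[transcript_id]
        row["state"] = state
        row["raw_transcription"] = text


def handle_transcription(
    store: InMemoryStore,
    transcribe: Callable[[bytes], str],
    audio: bytes,
    uniqueid: str,
    linkedid: Optional[str] = None,
    src_number: Optional[str] = None,
    dst_number: Optional[str] = None,
) -> dict:
    # Create the row first so every later state change targets this id.
    transcript_id = store.insert(uniqueid, linkedid, src_number, dst_number)
    try:
        text = transcribe(audio).strip()
    except Exception:
        store.finish(transcript_id, "failed")
        raise
    # Silent audio yields an empty transcript, which is stored as a success.
    store.finish(transcript_id, "done", text)
    return {"transcript_id": transcript_id, "transcript": text}
```

Creating the row before transcription starts means a failure still leaves a record in the `failed` state for that specific fragment, rather than for whichever row happens to share its uniqueid.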


Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

fix(db): allow multiple lines with same uniqueid

023d9b8

Stell0 added 2 commits April 30, 2026 16:14

feat(db): add src_number and dst_number to database

4c936e1

fix(get_transcript): add linkedid in post params

02e4acc

Stell0 requested a review from Copilot May 4, 2026 08:13

Copilot started reviewing on behalf of Stell0 May 4, 2026 08:14

Copilot AI reviewed May 4, 2026


Comment thread api.py Outdated

Comment thread db.py Outdated

Potential fix for pull request finding

a3f5fab

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

tommaso-ascani changed the base branch from main to fix_transcriptions May 4, 2026 08:20

Stell0 and others added 3 commits May 4, 2026 10:20

Potential fix for pull request finding

24fa7ab

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Merge branch 'fix_transcriptions' into nounique

2f29931

fix(test)

dcfde99

tommaso-ascani merged commit bc11c48 into fix_transcriptions May 4, 2026
5 checks passed

tommaso-ascani deleted the nounique branch May 4, 2026 09:29

tommaso-ascani merged 7 commits into fix_transcriptions from nounique

3 participants
