We consume RSS feeds from multiple podcast sources. When parsing transcripts from a PODSTR-powered feed, we noticed the <podcast:transcript> type attribute is always text/plain, even when the linked file is SRT format. This causes our transcript pipeline to miss the SRT structure — timestamps and cue markers end up as noise in the extracted text, which degrades the quality of downstream summaries and analysis built on top of it.
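To illustrate why the declared type matters on the consumer side, here is a minimal sketch of how a pipeline like ours might branch on it. The helper names `srtToPlainText` and `extractTranscriptText` are hypothetical and not part of PODSTR; the SRT handling is deliberately simplified and assumes well-formed cues.

```typescript
// Hypothetical consumer-side helpers; not part of PODSTR.
// Converts SRT text to plain text by dropping cue numbers and timestamp lines.
function srtToPlainText(srt: string): string {
  const timestamp = /^\d{2}:\d{2}:\d{2}[,.]\d{3}\s+-->\s+\d{2}:\d{2}:\d{2}[,.]\d{3}/;
  const lines = srt.replace(/\r\n/g, '\n').split('\n');
  const kept: string[] = [];
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === '') continue;
    if (timestamp.test(line)) continue;
    // A cue number is a bare integer directly followed by a timestamp line.
    if (/^\d+$/.test(line) && timestamp.test((lines[i + 1] ?? '').trim())) continue;
    kept.push(line);
  }
  return kept.join(' ');
}

// Picks a parser from the declared type. With type="text/plain" the SRT
// branch is never taken, so cue numbers and timestamps stay in the text.
function extractTranscriptText(body: string, declaredType: string): string {
  return declaredType === 'application/x-subrip' ? srtToPlainText(body) : body;
}
```

With the feed declaring text/plain, the second function returns the raw SRT body unchanged, which is the noise described above.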
Details
scripts/build-rss.ts hardcodes the type:
${transcriptUrl ? `<podcast:transcript url="${escapeXml(transcriptUrl)}" type="text/plain" />` : ''}
The Podcasting 2.0 transcript spec defines type as a required attribute. Podcast apps and validators use it to interpret transcript format.
Why this happens
The Nostr event tag stores ['transcript', url] without a MIME type, so build-rss.ts has no type information at RSS generation time.
Suggested fix
At upload time, infer the MIME type from the file extension (since browsers' File.type is unreliable for formats like .srt that aren't in the IANA registry). Pass it as an optional third element in the Nostr tag:
// usePublishEpisode.ts — infer MIME from extension (File.type is unreliable for .srt)
function inferTranscriptMime(filename: string, fileType: string): string {
  const ext = filename.split('.').pop()?.toLowerCase();
  const mimeMap: Record<string, string> = {
    srt: 'application/x-subrip',
    vtt: 'text/vtt',
    json: 'application/json',
    html: 'text/html',
    txt: 'text/plain',
  };
  return mimeMap[ext ?? ''] || fileType || 'text/plain';
}
tags.push(['transcript', transcriptUrl, inferTranscriptMime(transcriptFile.name, transcriptFile.type)]);
Then in build-rss.ts, read the optional third element:
const transcriptType = tags.get('transcript')?.[1] || 'text/plain';
Backward-compatible: existing 2-element ['transcript', url] tags continue defaulting to text/plain with no behavior change.
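As a rough sketch of the RSS side, the following shows the fallback working against raw Nostr tag arrays. `renderTranscriptTag` is a hypothetical helper, and the `escapeXml` here is a stand-in because the project's own implementation is not shown in this issue. Note that the index of the type depends on how tags are looked up: it is 2 in the raw `['transcript', url, type]` array, and 1 only if the lookup already strips the tag name.

```typescript
// Stand-in for the project's escapeXml helper (assumed behavior).
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: renders <podcast:transcript> from raw Nostr tags.
function renderTranscriptTag(tags: string[][]): string {
  const tag = tags.find((t) => t[0] === 'transcript');
  const url = tag?.[1];
  if (!url) return '';
  // Index 2 is the optional third element of the raw tag array;
  // older 2-element tags fall back to text/plain.
  const type = tag?.[2] || 'text/plain';
  return `<podcast:transcript url="${escapeXml(url)}" type="${escapeXml(type)}" />`;
}
```

A 3-element tag yields the stored MIME type, a 2-element tag yields text/plain, and an event without a transcript tag yields an empty string, matching the current output for existing episodes.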
Reproduction
curl -sI "https://blossom.primal.net/82daa00294af2bda132885feef9085c5daeb265c09ad15f9f8e0e65c5dbf8520" | grep -i content-type
# → application/x-subrip
curl -s "https://podcast.nostrcompass.org/rss.xml" | grep "podcast:transcript"
# → type="text/plain" (expected: application/x-subrip)
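The same check could be scripted. The sketch below is a hypothetical helper pair, not part of PODSTR: one function pulls the declared url/type pairs out of feed XML with a simple regex (assuming double-quoted attributes, as in the generated feed), and the other compares a declared type with a Content-Type header value. Fetching the feed and the headers is left out.

```typescript
// Hypothetical checker; attribute parsing is regex-based and assumes
// double-quoted attributes as emitted by build-rss.ts.
interface DeclaredTranscript {
  url: string;
  type: string;
}

function extractDeclaredTranscripts(xml: string): DeclaredTranscript[] {
  const results: DeclaredTranscript[] = [];
  const element = /<podcast:transcript\b([^>]*)>/g;
  let match: RegExpExecArray | null;
  while ((match = element.exec(xml)) !== null) {
    const attrs = match[1];
    const url = /\burl="([^"]*)"/.exec(attrs)?.[1];
    const type = /\btype="([^"]*)"/.exec(attrs)?.[1];
    if (url !== undefined && type !== undefined) results.push({ url, type });
  }
  return results;
}

// Compares the declared type with a Content-Type header value,
// ignoring parameters such as "; charset=utf-8" and letter case.
function typeMatchesHeader(declaredType: string, contentTypeHeader: string): boolean {
  const headerType = contentTypeHeader.split(';')[0].trim().toLowerCase();
  return declaredType.trim().toLowerCase() === headerType;
}
```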
I'm opening a PR with this fix.