Commit ae4945f

vercel-ai-sdk[bot], dpmishler, and gr2m authored
Backport: feat(deepgram): add text-to-speech support (#10429)
This is an automated backport of #10063 to the release-v5.0 branch.

---------

Co-authored-by: dpmishler <122304480+dpmishler@users.noreply.github.com>
Co-authored-by: Gregor Martynus <39992+gr2m@users.noreply.github.com>
1 parent c619f91 commit ae4945f

11 files changed, +929 -14 lines changed


.changeset/feat-deepgram-tts.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+---
+'@ai-sdk/deepgram': patch
+---
+
+feat(deepgram): add text-to-speech support
+
+Add text-to-speech support for Deepgram Aura models via the REST API. Supports all Aura voice models (aura-2-helena-en, aura-2-thalia-en, etc.) with proper audio format validation, encoding/container/sample_rate/bitrate combinations, and comprehensive parameter validation.
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+import { deepgram } from '@ai-sdk/deepgram';
+import { experimental_generateSpeech as generateSpeech } from 'ai';
+import 'dotenv/config';
+import { saveAudioFile } from '../lib/save-audio';
+
+async function main() {
+  const result = await generateSpeech({
+    model: deepgram.speech('aura-2-helena-en'),
+    text: 'Hello, welcome to Deepgram! This is a test of the text-to-speech API.',
+  });
+
+  console.log('Audio:', result.audio);
+  console.log('Warnings:', result.warnings);
+  console.log('Responses:', result.responses);
+  console.log('Provider Metadata:', result.providerMetadata);
+
+  await saveAudioFile(result.audio);
+}
+
+main().catch(console.error);
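
The example above imports `saveAudioFile` from `../lib/save-audio`, which is not part of this diff. As a rough illustration of what such a helper might do, here is a minimal sketch. The `AudioLike` shape, the `saveAudioSketch` name, the output directory, and the file naming are all assumptions made for this example, not the repository's actual helper.

```typescript
import { mkdirSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Assumption: the generated audio exposes raw bytes plus an optional format
// hint. This is a local stand-in type, not the SDK's own audio type.
interface AudioLike {
  uint8Array: Uint8Array;
  format?: string;
}

// Hypothetical helper: writes the bytes to `<dir>/speech-<timestamp>.<ext>`
// and returns the path of the file it created.
function saveAudioSketch(
  audio: AudioLike,
  dir: string = join(tmpdir(), 'deepgram-tts'),
): string {
  mkdirSync(dir, { recursive: true });
  const ext = audio.format ?? 'bin'; // fall back when no format is known
  const filePath = join(dir, `speech-${Date.now()}.${ext}`);
  writeFileSync(filePath, audio.uint8Array);
  return filePath;
}

// Usage with three placeholder bytes standing in for real audio data.
const demoPath = saveAudioSketch({
  uint8Array: new Uint8Array([1, 2, 3]),
  format: 'mp3',
});
console.log('Saved to', demoPath, 'bytes:', readFileSync(demoPath).length);
```

The sketch uses synchronous file calls to stay short. The helper in the example is awaited, so the real one is presumably asynchronous.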

packages/deepgram/README.md

Lines changed: 16 additions & 2 deletions
@@ -1,7 +1,7 @@
 # AI SDK - Deepgram Provider
 
 The **[Deepgram provider](https://ai-sdk.dev/providers/ai-sdk-providers/deepgram)** for the [AI SDK](https://ai-sdk.dev/docs)
-contains transcription model support for the Deepgram transcription API.
+contains transcription model support for the Deepgram transcription API and speech model support for the Deepgram text-to-speech API.
 
 ## Setup
 
@@ -19,7 +19,9 @@ You can import the default provider instance `deepgram` from `@ai-sdk/deepgram`:
 import { deepgram } from '@ai-sdk/deepgram';
 ```
 
-## Example
+## Examples
+
+### Transcription
 
 ```ts
 import { deepgram } from '@ai-sdk/deepgram';
@@ -33,6 +35,18 @@ const { text } = await transcribe({
 });
 ```
 
+### Text-to-Speech
+
+```ts
+import { deepgram } from '@ai-sdk/deepgram';
+import { experimental_generateSpeech as generateSpeech } from 'ai';
+
+const { audio } = await generateSpeech({
+  model: deepgram.speech('aura-2-helena-en'),
+  text: 'Hello, welcome to Deepgram!',
+});
+```
+
 ## Documentation
 
 Please check out the **[Deepgram provider documentation](https://ai-sdk.dev/providers/ai-sdk-providers/deepgram)** for more information.

packages/deepgram/src/deepgram-provider.ts

Lines changed: 18 additions & 0 deletions
@@ -1,5 +1,6 @@
 import {
   TranscriptionModelV2,
+  SpeechModelV2,
   ProviderV2,
   NoSuchModelError,
 } from '@ai-sdk/provider';
@@ -10,6 +11,8 @@ import {
 } from '@ai-sdk/provider-utils';
 import { DeepgramTranscriptionModel } from './deepgram-transcription-model';
 import { DeepgramTranscriptionModelId } from './deepgram-transcription-options';
+import { DeepgramSpeechModel } from './deepgram-speech-model';
+import { DeepgramSpeechModelId } from './deepgram-speech-options';
 import { VERSION } from './version';
 
 export interface DeepgramProvider extends ProviderV2 {
@@ -24,6 +27,11 @@ export interface DeepgramProvider extends ProviderV2 {
   Creates a model for transcription.
    */
   transcription(modelId: DeepgramTranscriptionModelId): TranscriptionModelV2;
+
+  /**
+  Creates a model for speech generation.
+   */
+  speech(modelId: DeepgramSpeechModelId): SpeechModelV2;
 }
 
 export interface DeepgramProviderSettings {
@@ -71,6 +79,14 @@ export function createDeepgram(
     fetch: options.fetch,
   });
 
+  const createSpeechModel = (modelId: DeepgramSpeechModelId) =>
+    new DeepgramSpeechModel(modelId, {
+      provider: `deepgram.speech`,
+      url: ({ path }) => `https://api.deepgram.com${path}`,
+      headers: getHeaders,
+      fetch: options.fetch,
+    });
+
   const provider = function (modelId: DeepgramTranscriptionModelId) {
     return {
       transcription: createTranscriptionModel(modelId),
@@ -79,6 +95,8 @@ export function createDeepgram(
 
   provider.transcription = createTranscriptionModel;
   provider.transcriptionModel = createTranscriptionModel;
+  provider.speech = createSpeechModel;
+  provider.speechModel = createSpeechModel;
 
   // Required ProviderV2 methods that are not supported
   provider.languageModel = () => {
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+export type DeepgramSpeechAPITypes = {
+  // Request body
+  text: string;
+
+  // Query parameters (these are set via query params, not body)
+  model?: string;
+  encoding?: string;
+  sample_rate?: number;
+  bit_rate?: number | string;
+  container?: string;
+  callback?: string;
+  callback_method?: 'POST' | 'PUT';
+  mip_opt_out?: boolean;
+  tag?: string | string[];
+};
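
The comments in this type separate the request body (`text`) from the options that travel as query parameters. The sketch below illustrates that split under stated assumptions: `SpeechParams` is a trimmed local copy of the type, `buildSpeakRequest` is a hypothetical helper, and `/v1/speak` is assumed to be the path handed to the provider's `url` callback. The speech model implementation is not shown in this part of the diff, so this is not necessarily how the commit builds its requests.

```typescript
// Trimmed local copy of the fields from DeepgramSpeechAPITypes used here.
type SpeechParams = {
  text: string; // sent in the JSON request body
  model?: string; // everything below is sent as a query parameter
  encoding?: string;
  sample_rate?: number;
  bit_rate?: number | string;
  container?: string;
};

// Assumption: the text-to-speech path appended to https://api.deepgram.com.
const SPEAK_PATH = '/v1/speak';

// Hypothetical helper: splits the params into a URL with a query string
// and a JSON body that carries only the text.
function buildSpeakRequest(params: SpeechParams): { url: string; body: string } {
  const { text, ...query } = params;
  const search = new URLSearchParams();
  for (const [key, value] of Object.entries(query)) {
    if (value === undefined) continue; // skip options that were not set
    search.append(key, String(value));
  }
  const qs = search.toString();
  return {
    url: `https://api.deepgram.com${SPEAK_PATH}${qs ? `?${qs}` : ''}`,
    body: JSON.stringify({ text }),
  };
}

const request = buildSpeakRequest({
  text: 'Hello, welcome to Deepgram!',
  model: 'aura-2-helena-en',
  encoding: 'linear16',
  sample_rate: 24000,
});
console.log(request.url);
console.log(request.body);
```

The sketch performs no validation. The changeset describes checks on encoding, container, sample rate, and bitrate combinations, and those would run before a request like this is assembled.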
