[requesting help] Audio speech-to-text using AWS Transcription. #1084

Open
@foragerr

Describe the bug
I'm trying to use AWS Transcription instead of Whisper as shown in this example, but I keep getting back empty text. I suspect I'm getting the encoding or sample rate wrong, and I wasn't able to find much info on the properties of the captured audio in the Chainlit docs.

To Reproduce
Riffing off this example, I'm doing:

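import chainlit as cl
from chainlit.element import ElementBased
from amazon_transcribe.client import TranscribeStreamingClient
from amazon_transcribe.handlers import TranscriptResultStreamHandler
from amazon_transcribe.model import TranscriptEvent


class TranscriptionEventHandler(TranscriptResultStreamHandler):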
    async def handle_transcript_event(self, transcript_event: TranscriptEvent):
        print("Transcript event fired")
        results = transcript_event.transcript.results
        for result in results:
            for alt in result.alternatives:
                print(alt.transcript)
                
@cl.on_audio_chunk
async def on_audio_chunk(chunk: cl.AudioChunk):
    if not cl.user_session.get("stream"):
        # transcription init
        transcribe_client = TranscribeStreamingClient(region="us-west-2")
        stream = await transcribe_client.start_stream_transcription(
            language_code="en-US",
            media_sample_rate_hz=44100,
            media_encoding="pcm",
        )
        cl.user_session.set("stream", stream)
        handler = TranscriptionEventHandler(stream.output_stream)
        cl.user_session.set("handler", handler)
        await handler.handle_events()

    print("CHUNK FIRED")
    stream = cl.user_session.get("stream")
    await stream.input_stream.send_audio_event(audio_chunk=chunk.data)


@cl.on_audio_end
async def on_audio_end(elements: list[ElementBased]):
    print("END FIRED")
    stream = cl.user_session.get("stream")
    await stream.input_stream.end_stream()
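
One thing that stands out in the snippet above: `await handler.handle_events()` runs inside the first chunk callback and normally only returns once the output stream ends, so the `send_audio_event` call below it may never be reached. A common alternative is to schedule the consumer as a background task with `asyncio.create_task` and await it after the stream is closed. Below is a minimal sketch of that pattern using only the standard library; `FakeStream` is a hypothetical stand-in for the transcription stream, not the real amazon-transcribe API, and whether this is the actual cause of the empty text is an assumption.

```python
import asyncio


class FakeStream:
    """Hypothetical stand-in for a streaming transcription session."""

    def __init__(self):
        self._queue = asyncio.Queue()

    async def send_audio_event(self, audio_chunk: bytes):
        await self._queue.put(audio_chunk)

    async def end_stream(self):
        # Sentinel telling the consumer that no more audio will arrive.
        await self._queue.put(None)

    async def handle_events(self, results: list):
        # Loops until the stream is closed, like a result-handler loop.
        while True:
            chunk = await self._queue.get()
            if chunk is None:
                break
            results.append(f"got {len(chunk)} bytes")


async def main():
    stream = FakeStream()
    results = []
    # Awaiting handle_events() directly here would block: nothing has been
    # sent yet, and nothing could be sent until that call returned.
    consumer = asyncio.create_task(stream.handle_events(results))
    for chunk in (b"\x00" * 4, b"\x00" * 8):
        await stream.send_audio_event(chunk)
    await stream.end_stream()
    await consumer  # let the consumer drain the remaining events
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Applied to the handlers above, that would mean storing the task in the user session when the stream is created and awaiting it in `on_audio_end` after `end_stream()`.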

Expected behavior
AWS transcription returns text transcription of captured audio.

Additional context
Essentially, I'm looking for an extra pair of eyes: pointers on which direction to take for troubleshooting, or any debugging tips.
I reckon if I can get this working, it'd be a useful addition to the cookbooks?
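
As a debugging aid for the encoding question, it may help to check what container the captured audio actually arrives in before declaring `media_encoding="pcm"`. Browser recorders commonly produce WebM/Opus rather than raw PCM, while the streaming API's `media_encoding` accepts `pcm`, `ogg-opus`, or `flac`. The helper below is a rough sketch that inspects the well-known magic bytes at the start of the first chunk; the function name `sniff_audio_container` is made up for illustration, and the assumption that the first chunk carries the container header may not hold for every client.

```python
def sniff_audio_container(first_bytes: bytes) -> str:
    """Guess the audio container from the leading magic bytes."""
    if first_bytes.startswith(b"\x1a\x45\xdf\xa3"):
        return "webm/matroska"  # EBML header used by WebM and Matroska
    if first_bytes.startswith(b"OggS"):
        return "ogg"
    if first_bytes.startswith(b"fLaC"):
        return "flac"
    if first_bytes[:4] == b"RIFF" and first_bytes[8:12] == b"WAVE":
        return "wav"
    # Raw PCM has no header, so it cannot be positively identified.
    return "unknown (possibly raw PCM)"


if __name__ == "__main__":
    # Example: the first four bytes of a WebM stream.
    print(sniff_audio_container(b"\x1a\x45\xdf\xa3\x9f\x42\x86\x81"))
```

Calling this on `chunk.data` for the first chunk (and printing the chunk's MIME type, if the chunk object exposes one) should show whether the audio needs to be decoded or resampled before it is sent to the transcription stream.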
