@patrickloeber
Contributor

This PR adds a new document loader, AssemblyAIAudioTranscriptLoader, that transcribes audio files with the AssemblyAI API and loads the transcribed text into documents.

  • Add new document_loader with class AssemblyAIAudioTranscriptLoader
  • Add optional dependency assemblyai
  • Add unit tests (using a Mock client)
  • Add docs notebook

This is the equivalent of the JS integration already available in LangChain.js. See the AssemblyAI page in the LangChain JS docs.

At its simplest, you can use the loader to get a transcript back from an audio file like this:

from langchain.document_loaders.assemblyai import AssemblyAIAudioTranscriptLoader

loader = AssemblyAIAudioTranscriptLoader(file_path="./testfile.mp3")
docs = loader.load()

To use it, you need the assemblyai Python package installed and the
environment variable ASSEMBLYAI_API_KEY set to your API key. Alternatively, the API key can be passed as an argument.
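
For illustration, here is a minimal sketch of the two ways of supplying the key described above. `resolve_api_key` is a hypothetical helper, not part of LangChain or the assemblyai package. The precedence shown (explicit argument first, then the environment variable) is an assumption rather than something this PR states.

```python
import os
from typing import Optional

# Environment variable named in the PR description.
ENV_VAR = "ASSEMBLYAI_API_KEY"


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: return an explicit key if given, else read the env var.

    The argument-over-environment precedence is an assumption made for this sketch.
    """
    key = api_key or os.environ.get(ENV_VAR)
    if not key:
        raise ValueError(
            f"No API key found: pass one explicitly or set {ENV_VAR}."
        )
    return key
```

The resolved key would then be handed to the loader or to the assemblyai client. The exact keyword argument is not shown in this PR, so check the loader's signature before relying on one.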

Twitter handles to shout out if so kindly 🙇
@AssemblyAI and @patloeber

- Add new class `AssemblyAIAudioTranscriptLoader`
- Add optional dependency `assemblyai`
- Add unit tests (using a Mock client)
- Add docs notebook

The `AssemblyAIAudioTranscriptLoader` transcribes audio files
with the AssemblyAI API and loads the transcribed text into documents.
@patrickloeber force-pushed the add-assemblyai-audio-transcript-loader branch from 712cbdb to d917776 on August 23, 2023 21:19
@baskaryan requested a review from eyurtsev on August 23, 2023 21:28
Collaborator

@eyurtsev left a comment

@baskaryan looks good to me, feel free to merge

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
@baskaryan merged commit 5990651 into langchain-ai:master on Aug 24, 2023
baskaryan pushed a commit that referenced this pull request Aug 24, 2023
…9687)

Uses the shorter import path

`from langchain.document_loaders import` instead of the full path
`from langchain.document_loaders.assemblyai`

Applies those changes to the docs and the unit test.

See #9667, which adds this new loader.