NotesMaker automates the creation of meeting minutes from video call transcripts. It reads VTT (Web Video Text Tracks) files, cleans up the transcripts by removing filler words, and summarizes the content using natural language processing. The summaries are then written to text files, capturing the essence of each meeting without manual effort.
- Transcript Cleanup: Removes common filler words and non-essential parts of speech to clean up the transcript.
- Automatic Summarization: Uses Azure OpenAI language models to generate a concise summary of each transcript.
- Minutes Generation: Writes the summaries into neatly formatted minutes in text files for easy review and distribution.
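
The cleanup step can be illustrated with a minimal sketch. This is not the project's actual implementation: the function names `vtt_to_text` and `remove_fillers` and the filler word list are assumptions made for illustration, and the sketch only handles plain cue text (no voice tags or styling).

```python
import re

# Hypothetical filler list; the words the project actually removes are not shown here.
FILLER_WORDS = {"um", "uh", "er", "ah", "hmm"}


def vtt_to_text(vtt: str) -> str:
    """Return the spoken text of a WebVTT document as a single string.

    The header, NOTE blocks, cue identifiers, and timing lines are dropped.
    """
    spoken = []
    # Blocks in a WebVTT file are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            # A cue is a block with a timing line; the cue text follows that line.
            if "-->" in line:
                spoken.extend(l.strip() for l in lines[i + 1:] if l.strip())
                break
    return " ".join(spoken)


def remove_fillers(text: str, fillers=FILLER_WORDS) -> str:
    """Drop words that, ignoring case and edge punctuation, are in the filler set."""
    kept = [w for w in text.split() if w.strip(".,!?;:").lower() not in fillers]
    return " ".join(kept)
```

Matching whole words rather than substrings avoids damaging words that merely contain a filler (for example "summer" contains "um"). Punctuation attached to a removed filler is removed with it.
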
- Python 3.8 or later
- An Azure account with an Azure OpenAI resource created
- A `.env` file with the necessary API keys and endpoints (created as `.notesmaker.env` during setup)
- Clone the repository: `git clone https://github.com/jacklatrobe/NotesMaker.git`, then `cd NotesMaker`
- Install dependencies: `pip install -r requirements.txt`
- Set up environment variables: create a `.notesmaker.env` file in the root directory with the following content, replacing the placeholders with your actual Azure OpenAI resource values:
  - `API_VERSION=<YOUR_API_VERSION>`
  - `BASE_URL=<YOUR_BASE_URL>`
  - `API_KEY=<YOUR_API_KEY>`
  - `DEPLOYMENT_NAME=<DEPLOYMENT_NAME>`
  - `CHUNK_SIZE=7500`
- Prepare your transcripts: place your `.vtt` transcript files in the `./transcripts` directory.
- Run NotesMaker: execute the main script with `python notemaker.py` to process the transcripts and generate minutes.

Processed minutes are saved in the `./minutes` directory, with each summary corresponding to its source transcript file.
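
To make the setup concrete, here is a minimal sketch of how the `.notesmaker.env` values and the transcript-to-minutes mapping could be handled. It uses only the standard library; the helper names `parse_env`, `minutes_path`, and `plan_run` are hypothetical, and the `<name>.vtt` to `<name>.txt` naming is an assumption, since the exact output file names are not specified here.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines; blank lines and '#' comments are ignored."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove optional surrounding quotes from the value.
        config[key.strip()] = value.strip().strip('"').strip("'")
    return config


def minutes_path(transcript: Path, out_dir: Path) -> Path:
    """Assumed naming scheme: meeting.vtt -> <out_dir>/meeting.txt."""
    return out_dir / (transcript.stem + ".txt")


def plan_run(transcripts_dir: Path, out_dir: Path) -> list:
    """Pair every .vtt file in transcripts_dir with its output path."""
    return [(p, minutes_path(p, out_dir)) for p in sorted(transcripts_dir.glob("*.vtt"))]
```

For example, `parse_env(Path(".notesmaker.env").read_text())` would return the five settings as strings, so `CHUNK_SIZE` needs an `int(...)` conversion before use.
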
- `API_VERSION`, `BASE_URL`, `API_KEY`, `DEPLOYMENT_NAME`: Configure these variables in your `.notesmaker.env` file to match your Azure OpenAI setup.
- `CHUNK_SIZE`: Adjust the chunk size for processing large transcripts. The default is 7500, but it can be changed based on the size of your transcripts and the desired level of detail in the summaries. The upper bound is set mainly by the maximum token limit of the model available to you.
- A note on LLMs: this is built to use Azure OpenAI, but porting it to another LLM via LangChain's wrappers should be straightforward.
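
The role of `CHUNK_SIZE` can be sketched as follows. This is an illustration under stated assumptions, not the project's code: it treats the chunk size as a character count (the unit is not specified above), and `chunk_text` and `summarize_transcript` are hypothetical names. The `summarize` argument stands in for whatever LLM call is used, which is also what makes swapping in another model straightforward.

```python
def chunk_text(text: str, chunk_size: int = 7500) -> list:
    """Split text into pieces of at most chunk_size characters, breaking on whitespace.

    A single word longer than chunk_size is kept whole as its own oversized chunk.
    """
    chunks, current, length = [], [], 0
    for word in text.split():
        # Account for the joining space when the chunk already has words.
        extra = len(word) + (1 if current else 0)
        if current and length + extra > chunk_size:
            chunks.append(" ".join(current))
            current, length = [], 0
            extra = len(word)
        current.append(word)
        length += extra
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_transcript(text: str, summarize, chunk_size: int = 7500) -> str:
    """Summarize each chunk with the supplied callable and join the partial summaries."""
    partials = [summarize(chunk) for chunk in chunk_text(text, chunk_size)]
    return "\n\n".join(partials)
```

A smaller chunk size yields more partial summaries and therefore more detail, while a larger one sends more text per request and must stay within the model's limit.
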
Distributed under the MIT License.
Jack Latrobe - [Latrobe Consulting Group](https://latrobe.group/)
- Email: jack@latrobe.group
- Project Link: https://github.com/jacklatrobe/NotesMaker