- Update (2025-03-22): You can try OpenAI's gpt-4o-transcribe and gpt-4o-mini-transcribe models for audio transcription, but I do not support them yet.
A tool that converts course playbacks into comprehensive notes or cleaned transcripts.
- Extract soundtracks from course playbacks.
- Convert soundtracks into text.
- Create a comprehensive summary or a cleaned transcript from the raw text. (Notes generated by language models are sometimes unreliable, so I use the models to produce a cleaner version of the transcripts instead.)
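The first step of the pipeline above, extracting the soundtrack from a playback, can be sketched with ffmpeg called from Python. This is a minimal sketch, not the project's actual code; it assumes ffmpeg is on the PATH, and the function names are mine:

```python
import subprocess
from pathlib import Path

def build_ffmpeg_cmd(video_path: Path, audio_path: Path) -> list[str]:
    """Command line that drops the video stream and keeps the audio as mp3."""
    return ["ffmpeg", "-y", "-i", str(video_path),
            "-vn",                    # no video
            "-acodec", "libmp3lame",  # encode audio as mp3
            str(audio_path)]

def extract_audio(video_path: Path, out_dir: Path) -> Path:
    """Extract the soundtrack of one playback file (ffmpeg must be installed)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    audio_path = out_dir / (video_path.stem + ".mp3")
    subprocess.run(build_ffmpeg_cmd(video_path, audio_path), check=True)
    return audio_path
```

The extracted audio can then be sent to the transcription API.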
- Clone the repository.

  ```shell
  git clone git@github.com:EmptyBlueBox/Course2Note.git
  cd Course2Note
  ```
- Configure the `config_private.toml` file.
  Copy the `config.toml` file to `config_private.toml` and configure the two API keys.
  Set `COURSE_NAME` and `STYLE` in the `config_private.toml` file.
- To generate a note, set `STYLE` to `note`.
- To generate a cleaned transcript, set `STYLE` to `cleaner`.
- Put your course playbacks and slides in the `Input` folder.
  The project folder should have the following structure:

  ```
  .
  ├── Course
  │   └── <Course Name>
  │       └── Playback
  │           ├── 1.mp4
  │           ├── 2.mp4
  │           ├── 3.mp4
  │           └── 4.mp4
  ├── LICENSE
  ├── README.md
  └── Workfolder
  ```
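Given that layout, the playbacks for a course can be discovered and processed in lecture order. A minimal sketch, assuming the numeric file names shown in the tree above (the function name is mine, not from `main.py`):

```python
from pathlib import Path

def list_playbacks(course_root: Path, course_name: str) -> list[Path]:
    """Return the playback videos for one course, sorted by lecture number."""
    playback_dir = course_root / course_name / "Playback"
    # Sort numerically, so 10.mp4 comes after 2.mp4 rather than before it.
    return sorted(playback_dir.glob("*.mp4"), key=lambda p: int(p.stem))
```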
- Install the dependencies.
  Create a conda environment and install the Python dependencies:

  ```shell
  conda create -n course2note python=3.10
  conda activate course2note
  pip install -r requirements.txt
  ```
- Run the script.

  ```shell
  python main.py
  ```
- XunFei API (for audio transcription)
- OpenAI ChatGPT API (for note generation from transcripts and slides)
- DeepSeek API (for note generation from transcripts and slides)
- The context length of the OpenAI API is limited.
- The information in slides is complex and unstructured.
- OCR results are messy, while copying and pasting from slides loses a lot of information.

So I drop the information in the slides and keep only the information in the audio; please read the slides yourself for better understanding.
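A common workaround for the context-length limitation is to split a long transcript into overlapping chunks and summarize each chunk separately. This is a sketch of that general technique, not what `main.py` actually does, and the chunk sizes are arbitrary:

```python
def chunk_text(text: str, max_chars: int = 8000, overlap: int = 200) -> list[str]:
    """Split a long transcript into overlapping chunks that fit a model's context.

    The overlap preserves some context across chunk boundaries so a sentence
    cut in half at the end of one chunk reappears at the start of the next.
    """
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Each chunk would then be summarized (or cleaned) independently, and the per-chunk results concatenated.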