- Update (2025-03-22): You can try OpenAI's gpt-4o-transcribe and gpt-4o-mini-transcribe models for audio transcription, but I do not support them yet.
A tool that converts course playbacks into comprehensive notes or cleaned transcripts.
- Extract soundtracks from course playbacks.
- Convert soundtracks into text.
- Create a comprehensive summary or a cleaned transcript from the raw text. (Notes generated by language models are sometimes unreliable, so I use the models to produce a cleaner version of the transcripts instead.)
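The first step of the pipeline above, extracting the soundtrack from a playback, can be sketched with ffmpeg called from Python. This is a minimal sketch, not the project's actual code; it assumes ffmpeg is on the PATH, and the function names are mine:

```python
import subprocess
from pathlib import Path

def build_ffmpeg_cmd(video_path: Path, audio_path: Path) -> list[str]:
    """Command line that drops the video stream and keeps the audio as mp3."""
    return ["ffmpeg", "-y", "-i", str(video_path),
            "-vn",                    # no video
            "-acodec", "libmp3lame",  # encode audio as mp3
            str(audio_path)]

def extract_audio(video_path: Path, out_dir: Path) -> Path:
    """Extract the soundtrack of one playback file (ffmpeg must be installed)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    audio_path = out_dir / (video_path.stem + ".mp3")
    subprocess.run(build_ffmpeg_cmd(video_path, audio_path), check=True)
    return audio_path
```

The extracted audio can then be sent to the transcription API.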
- Clone the repository.

  ```shell
  git clone git@github.com:EmptyBlueBox/Course2Note.git
  cd Course2Note
  ```
- Configure the `config_private.toml` file.
  Copy the `config.toml` file to `config_private.toml` and configure the two API keys.
  Set `COURSE_NAME` and `STYLE` in the `config_private.toml` file.
- To generate a note, set `STYLE` to `note`.
- To generate a cleaned transcript, set `STYLE` to `cleaner`.
- Put your course playbacks and slides in the `Input` folder.
  The project folder should have the following structure:

  ```
  .
  ├── Course
  │   └── <Course Name>
  │       └── Playback
  │           ├── 1.mp4
  │           ├── 2.mp4
  │           ├── 3.mp4
  │           └── 4.mp4
  ├── LICENSE
  ├── README.md
  └── Workfolder
  ```
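Given that layout, the playbacks for a course can be discovered and processed in lecture order. A minimal sketch, assuming the numeric file names shown in the tree above (the function name is mine, not from `main.py`):

```python
from pathlib import Path

def list_playbacks(course_root: Path, course_name: str) -> list[Path]:
    """Return the playback videos for one course, sorted by lecture number."""
    playback_dir = course_root / course_name / "Playback"
    # Sort numerically, so 10.mp4 comes after 2.mp4 rather than before it.
    return sorted(playback_dir.glob("*.mp4"), key=lambda p: int(p.stem))
```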
- Install the dependencies.
  Create a conda environment and install the Python dependencies:

  ```shell
  conda create -n course2note python=3.10
  conda activate course2note
  pip install -r requirements.txt
  ```
- Run the script.

  ```shell
  python main.py
  ```
- XunFei API (for audio transcription)
- OpenAI ChatGPT API (for note generation from transcripts and slides)
- DeepSeek API (for note generation from transcripts and slides)
- The context length of the OpenAI API is limited.
- The information in slides is complex and unstructured.
- OCR results are messy, while copying and pasting from slides loses a lot of information.

So I drop the information in the slides and keep only the information in the audio; please read the slides yourself for better understanding.
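A common workaround for the context-length limitation is to split a long transcript into overlapping chunks and summarize each chunk separately. This is a sketch of that general technique, not what `main.py` actually does, and the chunk sizes are arbitrary:

```python
def chunk_text(text: str, max_chars: int = 8000, overlap: int = 200) -> list[str]:
    """Split a long transcript into overlapping chunks that fit a model's context.

    The overlap preserves some context across chunk boundaries so a sentence
    cut in half at the end of one chunk reappears at the start of the next.
    """
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Each chunk would then be summarized (or cleaned) independently, and the per-chunk results concatenated.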