Conversational AI that gets context from a video course and provides teaching support to students, with an attitude of openness, eagerness, and lack of preconceptions, just as a beginner would.
- Python 3.11
- `ffmpeg` (for video-to-audio processing)
- `docker` and `docker compose` (for running the Milvus vector database)
The project is under development. Installation requires cloning the repository and installing the project in editable mode; follow the instructions in the Development section.
This open-source project provides a Command-Line Interface (CLI) application, `shoshin`, dedicated to processing video files. It leverages external APIs and therefore requires certain environment variables to be configured.

The `shoshin` command enables various video processing operations:
```
Usage: shoshin [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  convert          Converts video files to audio
  transcribe       Transcribes audio files to text
  embeddings-load  Compute embeddings for all documents in a folder and load them into Milvus
  query            Create a question about indexed documents
```
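For illustration, a command layout like the one above could be modeled with `argparse` subcommands. This is a hypothetical sketch only; the actual `shoshin` CLI may be built with a different framework, and the argument names here are assumptions based on the examples below.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch of the shoshin command layout;
    # the real CLI may be implemented differently.
    parser = argparse.ArgumentParser(prog="shoshin")
    sub = parser.add_subparsers(dest="command", required=True)

    convert = sub.add_parser("convert", help="Converts video files to audio")
    convert.add_argument("input")
    convert.add_argument("--output", required=True)

    transcribe = sub.add_parser("transcribe", help="Transcribes audio files to text")
    transcribe.add_argument("input")
    transcribe.add_argument("--output", required=True)

    load = sub.add_parser("embeddings-load", help="Load document embeddings into Milvus")
    load.add_argument("folder")
    load.add_argument("--language", default="en")

    query = sub.add_parser("query", help="Ask a question about indexed documents")
    query.add_argument("question")
    return parser


args = build_parser().parse_args(
    ["convert", "video/lesson01.mp4", "--output", "audio/lesson01.mp3"]
)
print(args.command, args.output)  # prints: convert audio/lesson01.mp3
```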
Here are a few examples demonstrating how to use `shoshin`:
```shell
# Convert a video file to an audio file
$ shoshin convert video/lesson01.mp4 --output audio/lesson01.mp3

# Transcribe an audio file to a text file
$ shoshin transcribe audio/lesson01.mp3 --output text/lesson01.txt

# Load all documents in a folder into the Milvus vector database
$ shoshin embeddings-load --language en transcriptions/

# Ask questions that the LLM will answer from the stored documents
$ shoshin query "What are the ethical implications of AI?"
```
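The commands above can be chained to process a lesson end to end. The helper below is a hypothetical wrapper: it only builds the command lines (the output paths are examples, not mandated by the CLI) and runs each step in order.

```python
import subprocess
from pathlib import Path


def pipeline_cmds(video: str) -> list[list[str]]:
    """Build the shoshin command lines from a video file to indexed documents.

    Hypothetical helper: output folder names are illustrative.
    """
    stem = Path(video).stem
    audio = f"audio/{stem}.mp3"
    text = f"transcriptions/{stem}.txt"
    return [
        ["shoshin", "convert", video, "--output", audio],
        ["shoshin", "transcribe", audio, "--output", text],
        ["shoshin", "embeddings-load", "--language", "en", "transcriptions/"],
    ]


def run_pipeline(video: str) -> None:
    for cmd in pipeline_cmds(video):
        subprocess.run(cmd, check=True)  # stop at the first failing step
```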
The LLM prompt is instructed to rely only on the indexed documents, not on the model's own knowledge, to avoid going off-track from the video lessons.
When running `embeddings-load`, it is important to select a language via `--language` to ensure a better word split is done for every document. Check the Haystack documentation for more details.
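To see why the split matters: embeddings are computed over chunks of words, so each document is cut into overlapping word windows before indexing. The sketch below is a simplified, whitespace-based illustration of that idea, not Haystack's actual splitter, which is language-aware (hence `--language`).

```python
def split_by_words(text: str, split_length: int = 5, overlap: int = 1) -> list[str]:
    """Cut text into overlapping word chunks (simplified illustration).

    Haystack's real splitter is language-aware; plain whitespace splitting
    like this works poorly for some languages, which is why the --language
    flag matters.
    """
    words = text.split()
    step = split_length - overlap  # assumes overlap < split_length
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        if start + split_length >= len(words):
            break  # last window already covers the tail
    return chunks
```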
We welcome external contributions, even though the project was initially intended for personal use. If you think some parts could be exposed with a more generic interface, please open a GitHub issue to discuss your suggestion.
To create a virtual environment and install the project and its dependencies, execute the following command in your terminal:
```shell
# Create and activate a new virtual environment
python3 -m venv venv
source venv/bin/activate

# Upgrade pip and install the project and its dependencies
pip install --upgrade pip
pip install -e '.[dev]'

# Install pre-commit hooks
pre-commit install

# Create folders and the initial database to store document metadata
mkdir -p audio video transcriptions volumes

# Set required environment variables
cp .env.development .env
```
Set up the necessary environment variables in the newly created `.env` file. All variables in there are required for the project to run. You can see all available settings in the `shoshin/conf/settings.py` module.

Example of a `.env` file:
```
OPENAI_API_KEY=<insert-your-openai-api-key-here>
```
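For illustration, such a file can be loaded along these lines. This is a minimal sketch only; the actual `shoshin/conf/settings.py` module may use a settings library instead.

```python
import os
from pathlib import Path


def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file and export them.

    Minimal sketch; the real settings module may use a dedicated library.
    """
    env: dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    os.environ.update(env)  # make the values visible to the process
    return env
```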
Finally, start the required services, the Milvus vector database and PostgreSQL:

```shell
docker-compose up -d
```
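The containers take a moment to become ready, so it can help to wait until the services accept connections before loading embeddings. The sketch below is a generic TCP poll, not part of the project; by default Milvus listens on port 19530 and PostgreSQL on 5432.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP port accepts connections or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True  # service is accepting connections
        except OSError:
            time.sleep(0.5)  # not up yet; retry
    return False


# e.g. wait_for_port("localhost", 19530)  # Milvus gRPC port (default)
```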