Video2Chat

Video2Chat is an innovative system that analyzes cooking videos and provides users with an interactive chat interface to engage with the content. Powered by a RAG (Retrieval-Augmented Generation) based system, it offers intelligent responses to user queries based on the extracted information from the videos.

1. Acquire Data

To begin, follow these steps to acquire and transcribe the necessary data:

Download Video: Utilize download_video.py to download 102 videos from YouTube. Transcribe Video: Utilize transcribe_video.py to transcribe the videos.

2. Preprocess Data

Prepare the data for analysis by performing the following preprocessing steps:

Extract Cooking Steps: Utilize extract_cooking_steps_images.py to extract the cooking steps from the videos. Extract Discussion Timestamps: Identify when the cooking steps are being discussed within the videos. Video Processing: Utilize video_processor.py to extract images/screenshots from the YouTube videos. All processed data is stored in the "processed_data" folder.

3. Build RAG

Construct the RAG using ChromaDB with OpenAI embeddings:

Implementation: The RAG implementation is stored in memory/chromadb.py. Building the RAG: Implement the RAG construction in test.py and building_cooking_vector_db.py. Persistent Database: The database (chroma.sqlite3) is persistently stored in memory/db/chroma.

4.Chat Interface

How to set it up

Install the requirements pip install -r requirements.txt
Create a .env with the following keys cp env_cp .env
Running the fastapi server python app.py
Returns .html file path

5. Handles multiple recipe and combines to cook for you

The system handles multiple recipes and offers customization options:

Matching Dishes: Based on user queries, relevant dishes stored in the vectordb are retrieved. If a match is found below the threshold of 0.15, it's considered a duplicate, and only one instance is returned. Otherwise, dishes with matches below a threshold of 0.20 are returned. Customization: Users can provide prompts to create custom dishes using GPT. Image Generation: Dishes are accompanied by step-by-step image generation using Dalle-3. HTML Rendering: Results are rendered in an HTML file for user interaction.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
chats		chats
processed_data		processed_data
recipes		recipes
transcripts		transcripts
video2chat		video2chat
.DS_Store		.DS_Store
.gitignore		.gitignore
README.md		README.md
app.py		app.py
create_db.py		create_db.py
env_cp		env_cp
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Video2Chat

1. Acquire Data

2. Preprocess Data

3. Build RAG

4.Chat Interface

How to set it up

5. Handles multiple recipe and combines to cook for you

About

Releases

Packages

Languages

rc-rajatchawla/Video2Text

Folders and files

Latest commit

History

Repository files navigation

Video2Chat

1. Acquire Data

2. Preprocess Data

3. Build RAG

4.Chat Interface

How to set it up

5. Handles multiple recipe and combines to cook for you

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages