easy-gpt4o

Blog Link: Easy-GPT4o - reproduce GPT-4o in less than 200 lines

Easy-GPT4O opensource version: use OpenAI older API implements GPT-4o in less than 200 lines of code.

Motivation

Why I start this project? This is just a toy project and a simple demo. I want to prove some ideas in this project:

Developers can build their own GPT-4o using existing APIs. By leveraging available tools, developers can easily access the capabilities of advanced models.
End-to-end models provide low latency but limited customization. This project explores the trade-off between latency and customization, highlighting the benefits and limitations of each approach.
The combined power of multiple models can outperform a single multimodal model. This project demonstrates the effectiveness of a collaborative approach, leveraging the collective intelligence of various models to achieve superior results.

Prerequisites

Python 3.6 or higher
OpenAI Python package (openai)
FFmpeg (for audio extraction)

Installation

Clone the repository:

git clone https://github.com/Chivier/easy-gpt4o

Install the required Python packages:
```
pip install -r requirements.txt
```
Download and install FFmpeg from the official website: https://ffmpeg.org/

Usage

# Set your own openai api
export OPENAI_API_KEY=xxxxxxx
python main.py input_video.mp4 output_audio.mp3

Replace input_video.mp4 with the path to your input video file, and output_audio.mp3 with the desired path to save the output audio file.

How to make it happen

Extracts audio from a video file
Transcribes the audio using OpenAI Whisper API
Generates image descriptions for key frames in the video using OpenAI GPT-4 Turbo API
Combines the audio transcription and image descriptions into a comprehensive response
Converts the response to speech using OpenAI TTS API

Demo

Demo 1

a.mp4

a1.mov

a2.mov

Demo 2

b.mp4

b.mov

TODO

Open-source Model Replace OpenAI API
Streaming video processing
Use RAG store long period memory

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

main.py

main.py

requirements.txt

requirements.txt

Repository files navigation

easy-gpt4o

Motivation

Prerequisites

Installation

Usage

How to make it happen

Demo

Demo 1

Demo 2

TODO

About

Releases

Packages

Languages

License

Chivier/easy-gpt4o

Folders and files

Latest commit

History

Repository files navigation

easy-gpt4o

Motivation

Prerequisites

Installation

Usage

How to make it happen

Demo

Demo 1

Demo 2

TODO

About

Resources

License

Stars

Watchers

Forks

Languages