An application for searching videos to find frames that match a text or image query. It uses ConvNext CLIP models from OpenCLIP to compare video frames against the feature representation of a user's text or image query.
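The core comparison can be sketched directly with OpenCLIP. The model name, pretrained tag, filenames, and query string below are illustrative assumptions, not the app's exact configuration:

```python
import torch
import open_clip
from PIL import Image

# Load a ConvNext CLIP model from OpenCLIP (model/pretrained tags are assumptions).
model, _, preprocess = open_clip.create_model_and_transforms(
    "convnext_base_w", pretrained="laion2b_s13b_b82k"
)
tokenizer = open_clip.get_tokenizer("convnext_base_w")
model.eval()

# Encode one video frame and one text query, then compare with cosine similarity.
frame = preprocess(Image.open("frame.jpg")).unsqueeze(0)
text = tokenizer(["a dog catching a frisbee"])

with torch.no_grad():
    image_features = model.encode_image(frame)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

similarity = (image_features @ text_features.T).item()
print(f"Frame-query similarity: {similarity:.3f}")
```

An image query works the same way, except the query is encoded with `encode_image` instead of `encode_text`.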
- Python 3.8+
pip install -r requirements.txt
The application is available in the Hugging Face Space here.
Or you can run it locally by doing the following:
- Run `python video-search-app.py` and open the given URL in a browser.
- Select either the "Text Query Search" or "Image Query Search" tab.
- Upload your video and write a text query or upload a query image.
- Adjust the parameters and click submit.
- Note: The web app can have performance issues with long videos (more than a few minutes); use the notebook for longer videos instead. A rough idea of the per-frame search this involves is sketched below.
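The sketch below shows one way such a frame-sampling search can be implemented, which is roughly why per-frame encoding becomes slow on long videos. The function name, sampling interval, and `top_k` default are assumptions for illustration; `model`, `preprocess`, and a normalized `text_features` are taken from the snippet above:

```python
import cv2
import torch
from PIL import Image

def search_video(video_path, text_features, model, preprocess,
                 sample_every_n=30, top_k=5):
    """Score sampled frames against a normalized query embedding and return the best matches."""
    cap = cv2.VideoCapture(video_path)
    results = []
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every_n == 0:
            # OpenCV decodes frames as BGR; convert to RGB before CLIP preprocessing.
            image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            with torch.no_grad():
                feats = model.encode_image(preprocess(image).unsqueeze(0))
                feats /= feats.norm(dim=-1, keepdim=True)
            results.append(((feats @ text_features.T).item(), idx))
        idx += 1
    cap.release()
    # Highest-similarity frames first.
    return sorted(results, reverse=True)[:top_k]
```

Sampling only every N-th frame (and batching the encoder calls) is the usual way to keep this tractable for longer videos.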
`video-search-notebook.ipynb` provides an alternate UI with a few extra features.