Use AI to index your memes by their content and text, making them easily retrievable for your meme warfare pleasures.
All processing - from image-to-text extraction, to vector embedding, to search - is performed locally.
This repository contains code, a walkthrough notebook (meme_search_walkthrough.ipynb), and apps for indexing, searching, and easily retrieving your memes based on semantic search of their content and text.
This repo contains two versions of the meme search app. Both versions can be used for core meme search organization and retrieval, with the pro version offering a significantly expanded feature set at the cost of a more complex architecture.
- The standard version: a simple one-page app that contains all the base functionality you need. Simple to install and configure.
- The pro version: a multi-page app with an enhanced UI and additional features driven by the community, like description editing, meme tagging, and multi-path indexing. Requires a larger memory footprint.
The standard version of meme search is a simple one-page app that allows you to index a directory of memes and retrieve them via text-based search, as illustrated below.
While not as feature rich as the pro version of meme search, the standard version provides all the base functionality you need to organize and retrieve your memes. The standard version is also simpler to install and configure, consisting of a single server / docker container.
To create a handy tool for your own memes, pull the repo and install the requirements file
pip install -r requirements.txt
Note that the particular pinned requirements here are necessary to avoid a nasty segmentation fault involving sentence-transformers, current as of 6/5/2024.
Alternatively you can install all the requirements you need using docker via the compose file found in the repo. The command to install the above requirements and start the server using docker-compose is
docker compose up
After indexing your memes you can then start the server (a streamlit app), allowing you to semantically search for and retrieve your memes
python -m streamlit run meme_search/app.py
To start the app via docker-compose use
docker compose up
Note: you can drag and drop any retrieved meme directly from the streamlit app into any messenger app of your choice.
Place any images / memes you would like indexed for the search app in this repo's subdirectory
data/input/
You can clear out the default test images in this location first, or leave them.
Next, click the "refresh index" button to update your index whenever images are added to or removed from the image directory. The refresh affects only the newly added or removed images.
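As a rough illustration of how a refresh can touch only the changed images, the sketch below compares the files on disk with the paths already in an index. The function name `diff_index`, the suffix list, and the idea that the index stores plain file paths are all assumptions made for this example, not the app's actual code.

```python
from pathlib import Path

# Assumed set of image formats; the real app may accept others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def diff_index(image_dir, indexed_paths):
    """Return (to_add, to_remove) by comparing the directory with the index."""
    on_disk = {
        str(p)
        for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }
    indexed = set(indexed_paths)
    to_add = sorted(on_disk - indexed)     # new files that still need processing
    to_remove = sorted(indexed - on_disk)  # stale entries whose file is gone
    return to_add, to_remove
```

Only the paths in `to_add` would need to go through extraction and embedding, and only those in `to_remove` would need to be dropped from the index.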
Alternatively - at your terminal - paste the following command
python meme_search/utilities/create.py
or, if running the server via docker, use
docker exec meme_search python meme_search/utilities/create.py
You will see printouts at the terminal indicating success of the 3 main stages for making your memes searchable. These stages are
- extract: get a text description of each image, including OCR of any text on the image, using the kickass tiny vision-llm moondream
- embed: window and embed each image's text description using a popular embedding model, sentence-transformers/all-MiniLM-L6-v2
- index: index the embeddings in faiss, an open source and local vector database, and store references connecting the embeddings to their images in the greatest little db of all time, sqlite
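The windowing and reference-keeping parts of these stages can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the window size and stride, the `chunk_refs` table, and both function names are hypothetical, and the moondream, embedding, and faiss calls are deliberately left out.

```python
import sqlite3


def window_text(text, size=5, stride=3):
    """Split a description into overlapping word windows (sizes are assumed)."""
    words = text.split()
    if len(words) <= size:
        return [" ".join(words)] if words else []
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def store_chunks(conn, image_path, chunks, first_vector_id):
    """Record which image each chunk (and so each embedding row) came from."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunk_refs ("
        "vector_id INTEGER PRIMARY KEY, image_path TEXT, chunk TEXT)"
    )
    rows = [(first_vector_id + i, image_path, c) for i, c in enumerate(chunks)]
    conn.executemany("INSERT INTO chunk_refs VALUES (?, ?, ?)", rows)
    conn.commit()
    return first_vector_id + len(chunks)  # next free vector id
```

Each chunk would be embedded and added to the vector index in the same order, so a hit on vector `n` can be mapped back to its image by looking up `vector_id = n` in the table.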
This meme search pipeline is written in pure Python and is built using the following open source components:
- moondream: a tiny, kickass vision language model used for image captioning / extracting image text
- all-MiniLM-L6-v2: a very popular text embedding model
- faiss: a fast and efficient vector db
- sqlite: the greatest database of all time, used for data indexing
- streamlit: for serving up the app
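To make the role of the vector index concrete, here is a brute-force version of the nearest-neighbor lookup that a vector db performs efficiently at scale. It is a pure Python stand-in, not faiss's API; the function names and the toy 2-dimensional vectors are assumptions for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, vectors, k=3):
    """Return (vector_id, score) pairs for the k most similar stored vectors."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stored embeddings; real ones come from the embedding model.
stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
best = top_k([1.0, 0.0], stored, k=2)
```

In the real pipeline the query text is embedded with the same model as the descriptions, and the returned vector ids are resolved to image paths through the sqlite references.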
The walkthrough notebook (meme_search_walkthrough.ipynb) walks through the whole process! You can also watch an overview of this walkthrough by clicking here.
Tests can be run by first installing the test requirements as
pip install -r requirements.test
Then the test suite can be run as
python -m pytest tests/
The pro version of meme search builds on the standard version, adding an array of features requested by the community.
These additional features include:
- Auto-Generate Meme Descriptions: target specific memes for auto-description generation (instead of applying it to your entire directory).
- Manual Meme Description Editing: edit or add descriptions manually for better search results; no need to wait for auto-generation if you don't want to.
- Tags: create, edit, and assign tags to memes for better organization and search filtering.
- Faster Vector Search: Postgres and pgvector power faster keyword and vector searches with streamlined database transactions.
- Keyword Search: pro adds traditional keyword search in addition to semantic/vector search.
- Directory Paths: organize your memes across multiple subdirectories, with no need to store everything in one folder.
- New Organizational Tools: filter by tags, directory paths, and description embeddings, plus toggle between keyword and vector search for more control.
To start up the pro version of meme search, pull this repository and start the server cluster with docker-compose
docker compose -f docker-compose-pro.yml up
This pulls and starts containers for the app, database, and auto description generator. The app itself will run on port 3000 and is available at http://localhost:3000
To start the app alone, pull the repo and cd into the meme_search/meme_search_pro/meme_search_app directory. Once there, execute the following to start the app in development mode
./bin/dev
When doing this, ensure you have an available Postgres instance running locally on port 5432.
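If you are unsure whether anything is listening on that port, a quick check along these lines can help before starting the app. This is a small sketch using only the standard library; the function name `port_is_open` is made up for this example, and it only confirms that something accepts TCP connections, not that it is actually Postgres.

```python
import socket


def port_is_open(host="localhost", port=5432, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

Calling `port_is_open()` checks the default Postgres port, and `port_is_open(port=3000)` does the same for the app once it is running.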
With the pro version you can index your memes by creating your own descriptions, or by generating descriptions automatically, as illustrated below.
The pro version pipeline contains many of the components of the standard version, with some variations and several additional components.
- the app, along with its enhanced features, is built using Ruby on Rails
- a ruby version of the same embedding model is used in place of the Pythonic version
- a single Postgres database is used in place of the duo used with the standard version
- the auto generator is isolated in its own image / container to allow for better maintenance, queueing, and cancellation
To run tests locally, pull the repo and cd into the meme_search/meme_search_pro/meme_search_app directory. Once there, tests can be executed with
rails test test/system
When doing this, ensure you have an available Postgres instance running locally on port 5432.
Meme Search is under active development! See the CHANGELOG.md in this repo for a record of the most recent changes.
Feature requests and contributions are welcome!
See the discussion section of this repository for suggested enhancements to contribute to / weigh in on!
Please see CONTRIBUTING.md for some boilerplate ground rules for contributing.