Using Semantic Search to retrieve memes
The original dataset is a small dataset of 6k memes, annotated by the OpenFlamingo-9B and MiniGPT4 models for their zero-shot and few-shot experiments.
The final dataset and embeddings are also available on 🤗 Hugging Face.
├── README.md <- The top-level README for developers using this project.
│
├── app
│ ├── app.py <- Gradio App
│ ├── requirements.txt <- Requirements for Gradio App
│
├── data
│ ├── input.csv <- Final dataset that was used for encoding
│ ├── meme-embeddings.pkl <- Embedding of the memes in the dataset
│ ├── raw_memes.json <- Raw Dataset from meme-cap
│ ├── string_data.csv <- Dataset before concatenating input sequence
│ └── required_cols.csv <- Dataset with string columns
│
├── notebooks
│ ├── EDA and Cleaning Data.ipynb <- Exploring and cleaning the raw data to produce input.csv
│ ├── Semantic Search.ipynb <- Using Sentence Transformers to build a semantic search based on cosine similarity
│
└── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
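The ranking step behind a cosine-similarity search, as described for the Semantic Search notebook, can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the helper names `cosine_similarity` and `top_k` are hypothetical, and the three-dimensional vectors are toy values standing in for real Sentence Transformer embeddings such as those stored in meme-embeddings.pkl.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as matching nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus_vecs, k=3):
    """Return (index, score) pairs for the k corpus vectors closest to the query."""
    scored = [
        (idx, cosine_similarity(query_vec, vec))
        for idx, vec in enumerate(corpus_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stand-ins for meme embeddings (assumption: real ones are much longer).
meme_vecs = [
    [1.0, 0.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [0.9, 0.1, 0.0]

results = top_k(query_vec, meme_vecs, k=2)
for idx, score in results:
    print(f"meme {idx}: similarity {score:.3f}")
```

In a real pipeline the query text would first be encoded with the same model that produced the meme embeddings, so that the query and the memes share one vector space.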
Clone the project
git clone https://github.com/bhavya-giri/retrieving-memes
Go to the project directory
cd retrieving-memes
Install dependencies
pip install -r requirements.txt
Start the notebook
jupyter notebook
Or open with JupyterLab
jupyter lab