
OmniQuery: Contextually Augmenting Captured Multimodal Memories to Enable Personal Question Answering

What is OmniQuery

OmniQuery enables free-form question answering over large personal photo albums (photos and videos). It leverages retrieval-augmented generation (RAG) and applies contextual augmentation during indexing to improve semantic retrieval for arbitrary queries. For more details and examples, please refer to our paper and project page.
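To illustrate the retrieval idea: each captured memory is indexed not only by its own caption but also by augmented context, so more queries can match it semantically. The sketch below is a toy illustration only, not the OmniQuery pipeline: the real system presumably uses LLM-generated context and embedding-based retrieval, whereas this toy substitutes simple token overlap (Jaccard similarity).

```python
# Toy sketch of contextually augmented retrieval (illustration only;
# OmniQuery itself uses LLMs and semantic retrieval, not token overlap).

def tokens(text):
    return set(text.lower().split())

def index_memory(caption, context):
    # Contextual augmentation: index the caption together with inferred
    # context so queries about the event (not just the pixels) can match.
    return tokens(caption) | tokens(context)

def retrieve(query, indexed, top_k=2):
    q = tokens(query)
    # Rank memories by Jaccard similarity between query and index tokens.
    scored = [(len(q & t) / len(q | t), cap) for cap, t in indexed]
    return [cap for score, cap in sorted(scored, reverse=True)[:top_k] if score > 0]

# Hypothetical memories: (caption, augmented context).
memories = [
    ("people holding drinks", "birthday party at a rooftop bar"),
    ("a dog on the beach", "summer trip to San Diego"),
    ("slides on a screen", "CHI conference talk"),
]
indexed = [(cap, index_memory(cap, ctx)) for cap, ctx in memories]
print(retrieve("photos from the birthday party", indexed))
# → ['people holding drinks', 'a dog on the beach']
```

Note that "birthday party" appears only in the augmented context, not the caption; without augmentation, the query would miss that memory entirely.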

To-Do List

  • Release source code for OmniQuery.
  • Add parameter control (e.g., topK).
  • Enable support for open-source large language models (LLMs).

Installation

Prerequisites

Make sure you have Homebrew and Anaconda installed on your machine.

Clone this repo:

git clone https://github.com/ljhnick/omniquery.git

Install exiftool and ffmpeg globally.

brew install exiftool
brew install ffmpeg
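As a quick sanity check (assuming, as the processing scripts likely do, that both tools must be discoverable on your PATH), you can verify the installs from Python. This helper is illustrative, not part of the OmniQuery codebase:

```python
import shutil

def check_tools(tools):
    """Return the subset of command-line tools missing from PATH."""
    return [t for t in tools if shutil.which(t) is None]

missing = check_tools(["exiftool", "ffmpeg"])
if missing:
    print("Missing tools, install with Homebrew:", ", ".join(missing))
else:
    print("All required tools found.")
```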

Create a conda environment and upgrade pip and wheel.

conda create --name omniquery python=3.10.14 -y
conda activate omniquery
pip install --upgrade pip wheel

Install dependencies

pip install -r requirements.txt

Set up API keys

OmniQuery currently uses Google's Cloud Vision API for OCR and OpenAI's APIs for captioning, reasoning, question answering, and more. Before running OmniQuery on your own images, you therefore need to set up the API keys.

For the Google Cloud Vision API, please follow Google's official instructions to generate a local credential (.json file).

Create a .env file in the root folder and add your API keys:

OPENAI_API_KEY="your_api_key"
GOOGLE_APPLICATION_CREDENTIALS="<path_to_credential_json>"
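The .env format is simple KEY="value" lines. How OmniQuery loads them is not specified here (python-dotenv is a common choice); the stdlib sketch below just shows the expected file format and is not the project's actual code. The credential path in the demo is a placeholder:

```python
import os
import tempfile

def load_env(path):
    """Minimal .env loader: copy KEY="value" lines into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"')

# Demo with a temporary file; your real .env lives in the repo root,
# and the values below are placeholders.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write('OPENAI_API_KEY="your_api_key"\n')
    f.write('GOOGLE_APPLICATION_CREDENTIALS="/path/to/credential.json"\n')

load_env(f.name)
print(os.environ["OPENAI_API_KEY"])  # → your_api_key
os.unlink(f.name)
```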

Running OmniQuery

  1. Prepare your data: download the photos and videos from your iOS device into the <root>/data/raw/ folder (we recommend using Image Capture to transfer photos and videos stored locally on your phone).

  2. Index your memories with contextual augmentation (this will take a while):

python init.py

  3. Once indexing is finished, start the web app:

python frontend/app.py

  4. Navigate to localhost:5000. You should see OmniQuery running in your browser.

  5. Each time you run OmniQuery, navigate to Settings and initialize the app first (the button is grayed out once the app has been successfully initialized).

  6. Optionally, toggle between OmniQuery (Full) and OmniQuery Lite. OmniQuery Lite is faster and cheaper, but it does not perform query augmentation. We suggest trying OmniQuery Lite first.

Credits

OmniQuery is created by Jiahao Nick Li, Zhuohao Zhang, and Jiaju Ma.

Citation

@misc{li2024omniquerycontextuallyaugmentingcaptured,
    title={OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering}, 
    author={Jiahao Nick Li and Zhuohao Jerry Zhang and Jiaju Ma},
    year={2024},
    eprint={2409.08250},
    archivePrefix={arXiv},
    primaryClass={cs.HC},
    url={https://arxiv.org/abs/2409.08250}, 
}

License

The software is available under the MIT License.

Contact

If you have any questions, feel free to open an issue or contact Jiahao Nick Li.
