An Outlook add-in that brings generative AI features (email composition, email thread summarization (WIP), inbox Q&A (WIP)) to Outlook securely and privately. It uses a local LLM served via NVIDIA TensorRT-LLM.
The system has two components:
1. An Outlook add-in front end (React, Office Add-in framework)
2. An LLM inference backend (Python, Flask, TensorRT-LLM)
To get the system running:
1. Clone this repository:

        git clone https://github.com/fgblanch/OutlookLLM.git
2. Install LLM dependencies:

   2.1 Install TensorRT-LLM for Windows using the instructions here.

   2.2 Download or build your TensorRT-LLM model of choice. The model needs to be instruct-tuned (Llama format).
   - I used Mistral 7B Instruct from HuggingFace, Mistral-7B-Instruct-v0.2, and converted it to TensorRT-LLM using the instructions here.
   - Other models tested: [Llama2 7B HF Chat](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) and Gemma 7B IT.
3. Outlook add-in generation and sideloading. (WIP)
4. Install LLM backend dependencies:

        pip install -r requirements.txt
5. Configure LLM backend HTTPS certificates. (WIP)
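   Since the certificate setup is still a work in progress, here is a minimal sketch of generating a self-signed certificate for local development using the `cryptography` package. The package choice, file names, and `localhost` subject are assumptions for illustration, not the repo's actual setup; Outlook add-ins can only call an HTTPS endpoint.

   ```python
   from datetime import datetime, timedelta

   from cryptography import x509
   from cryptography.x509.oid import NameOID
   from cryptography.hazmat.primitives import hashes, serialization
   from cryptography.hazmat.primitives.asymmetric import rsa


   def make_self_signed_cert(cert_path="cert.pem", key_path="key.pem"):
       """Write a self-signed localhost certificate for local development."""
       key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
       # Self-signed: subject and issuer are the same name.
       name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "localhost")])
       cert = (
           x509.CertificateBuilder()
           .subject_name(name)
           .issuer_name(name)
           .public_key(key.public_key())
           .serial_number(x509.random_serial_number())
           .not_valid_before(datetime.utcnow())
           .not_valid_after(datetime.utcnow() + timedelta(days=365))
           .sign(key, hashes.SHA256())
       )
       with open(key_path, "wb") as f:
           f.write(key.private_bytes(
               serialization.Encoding.PEM,
               serialization.PrivateFormat.TraditionalOpenSSL,
               serialization.NoEncryption(),
           ))
       with open(cert_path, "wb") as f:
           f.write(cert.public_bytes(serialization.Encoding.PEM))


   if __name__ == "__main__":
       make_self_signed_cert()
   ```

   Note that a self-signed certificate is typically not trusted by Outlook out of the box; you may need to add it to the Windows trusted root store for local testing.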
6. Run the LLM backend. (WIP)
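   Until the run instructions land, here is a minimal sketch of what the Flask backend could look like. The `/compose` route, port, and certificate paths are assumptions for illustration; the real backend would call the TensorRT-LLM runtime instead of the stub below.

   ```python
   from flask import Flask, jsonify, request

   app = Flask(__name__)


   def generate_with_llm(prompt: str) -> str:
       # Stub: the real backend would run inference with the TensorRT-LLM
       # engine loaded at startup (this helper is hypothetical).
       return f"Draft reply for: {prompt}"


   @app.route("/compose", methods=["POST"])
   def compose():
       # The add-in posts JSON like {"prompt": "..."} (assumed schema).
       prompt = request.get_json(force=True).get("prompt", "")
       return jsonify({"text": generate_with_llm(prompt)})


   if __name__ == "__main__":
       # Serve over HTTPS using the certificates configured in step 5
       # (paths are assumptions; adjust to where your certs live).
       app.run(host="127.0.0.1", port=5000, ssl_context=("cert.pem", "key.pem"))
   ```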
7. Enjoy! ;)
Next steps:
- Build RAG on the backend for Inbox and Calendar Q&A