Upload Documents * Ask Questions * Local & Private
Superhat lets you run a private, secure server that stores your documents and answers questions about them using AI. You can also ask it to generate reports with graphs and charts. CSV files and spreadsheets are stored in a SQL database, so huge tables are not a problem. You can upload thousands of documents and ask questions; Superhat automatically figures out which documents are relevant to each answer. Every answer is backed by references to the documents it retrieved, so you can double-check it (manually or with another AI). Best of all, every software component runs locally: after installation you can cut the server off from the internet and it will keep working.
Everything runs locally on your server
| Service | Notes |
|---|---|
| Web Server | This is your primary gateway/frontend to use the service |
| Postgres | SQL database to store huge CSV/Sheets |
| Embedding Inference Server | Huggingface model to generate embeddings for RAG |
| ReRanker Inference Server | Huggingface model to re-rank documents for mem0 module |
| Weaviate | VectorDB of choice for RAG |
| vLLM Inference Server | Runs an open-source LLM of your choice, e.g. Qwen3/gpt-oss-20b |
| minio | s3 compatible storage engine used to manage uploaded documents |
| VectorDb Server | Superhat server for vectordb operations |
| API server | Exposes upload/user/query operations to the outside world via an API with token authentication |
| Chat Server | Superhat server that handles all chat interaction and document retrieval |
| Ingestion Server | Superhat server responsible for indexing uploaded documents |
| Metadata Server | Superhat server responsible for keeping track of all document locations, life-cycle, sharing, and ownership |
| keycloak | User authentication server |
Prerequisites:
- Docker: docker, docker-compose, docker-registry
One-time setup and initialization
--- do this on your server ---
```shell
# Clone the repo
$ git clone https://github.com/queryhat/super-hat.git
# All subsequent commands are run from the ".../local" directory
# Deploy to the local server using docker compose
$ cd super-hat/deployment/dev/local
$ cp .env.example .env
# At a minimum, you will want to edit the following in .env:
# QHAT_LOCAL_VOLUME_ROOT: where all your data stays persistent, e.g. database, vectordb, etc.
# VLLM_API_KEY, OPENAI_BASE_URL, VLLM_MODEL_ID: control how and where the LLM is accessed
# Now build the docker images and initialize the root directory
$ ./setup_local.sh
# Start the service ... the very first run takes a couple of minutes to pull images/LLM weights from the internet
$ docker compose up -d websvr
# Note down the important port numbers; you will need them to access the service
$ egrep 'QHAT_APISVR_SERVICE_PORT|QHAT_WEBSVR_SERVICE_PORT' .env
# QHAT_APISVR_SERVICE_PORT=8000
# QHAT_WEBSVR_SERVICE_PORT=8021
```
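Once the stack is up, you can sanity-check that the ports you noted are actually accepting connections. Here is a minimal Python sketch using only the standard library; the port numbers assume the default `.env` values above, so adjust them if you changed yours:

```python
import socket

def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP service accepts connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Default ports from .env; edit to match your configuration.
SERVICES = {
    "websvr": 8021,
    "apisvr": 8000,
}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if is_listening("localhost", port) else "down"
        print(f"{name:8s} {port}  {status}")
```

Run it on the server itself (or through the ssh tunnel described below once it is set up).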
You can use either OpenAI-API-compatible hosted models or open-source models served by the included vLLM. This is controlled through the .env file.
OpenAI:
```shell
VLLM_PORT=443
VLLM_MODEL_ID="gpt-5-mini"
VLLM_API_KEY='sk-proj-...'
OPENAI_BASE_URL="https://api.openai.com/v1"
```
Groq:
```shell
VLLM_PORT=443
VLLM_MODEL_ID="openai/gpt-oss-120b"
VLLM_API_KEY='gsk_...'
OPENAI_BASE_URL="https://api.groq.com/openai/v1"
```
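Whichever provider you pick, the three variables map onto a standard OpenAI-compatible chat-completions request. The sketch below (standard library only; the request shape is the generic OpenAI chat API, not anything Superhat-specific) shows how the base URL, key, and model id fit together:

```python
import json
import os
import urllib.request

# These would normally come from the .env file described above;
# the fallback values are just illustrative defaults.
base_url = os.environ.get("OPENAI_BASE_URL", "https://api.groq.com/openai/v1")
api_key = os.environ.get("VLLM_API_KEY", "gsk_...")
model_id = os.environ.get("VLLM_MODEL_ID", "openai/gpt-oss-120b")

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    payload = {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("Summarize my uploaded documents.")
print(req.full_url)
```

If this request succeeds with `urllib.request.urlopen(req)` from the server, the same settings should work for Superhat.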
vLLM:
Make sure you have a GPU with enough memory. Limited testing has been done on Qwen3 and gpt-oss-*; the gpt-oss models appear to work better.
```shell
VLLM_MODEL_ID="openai/gpt-oss-20b"
VLLM_API_KEY='cSt6YXROaHNET0EwaGV5cQa6'  # <=== generate/use any random key
# To use vLLM, two further steps are needed:
# 1. Pull the model weights from Huggingface
$ ./vllm-openai/init-gpt-oss.sh
# 2. Recreate the vLLM container
$ docker compose up -d --force-recreate vllm-openai
```
Create an ssh tunnel from your desktop/laptop to the superhat server:
```shell
$ ssh -N -L 8021:localhost:8021 -L 8000:localhost:8000 superhat-server
```
Now you can access and use superhat from your browser: "http://localhost:8021/login" (you can skip the /login on future visits).
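If you connect often, the same tunnel can be declared once in `~/.ssh/config` instead of retyping the flags. A sketch (the `HostName` and `User` values are placeholders for your own server; the ports match the defaults above):

```
Host superhat-server
    HostName your.server.address
    User your-username
    LocalForward 8021 localhost:8021
    LocalForward 8000 localhost:8000
```

With this in place, `ssh -N superhat-server` brings up both forwards.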
Each registered user can upload their own files and ask questions about them.
Areas to improve in the future
- Supported file types: only pdf/docx/csv/Google Sheets are supported right now.
- Single-turn chat: every question you ask is standalone; no chat history is used or sent.
- Inconsistent document references: the response sometimes includes references to the retrieved documents and other times does not, i.e. the behaviour is inconsistent. It does, however, answer strictly from the added documents.