Examplify

STILL WIP

Examplify is an offline, CPU-first chat application for memory-scarce environments that performs Retrieval-Augmented Generation (RAG) on your corpus of data. It uses an 8-bit quantised openchat-3.5 model running on CTranslate2's inference engine for maximum CPU performance.

Requirements

Docker Compose
10 GB RAM

Benchmarks

Model	Tokens	Time (s)	Throughput (t/s)	Device
zephyr-7b-beta-ct2-int8	219	2.272	96.396	NVIDIA RTX 3090
zephyr-7b-beta-ct2-int8	211	24.482	8.619	Intel i7-8700
openchat-3.5-ct2-int8	151	0.832	181.469	NVIDIA RTX 3090
openchat-3.5-ct2-int8	156	1.573	99.160	NVIDIA RTX 3080 Ti
openchat-3.5-ct2-int8	152	10.611	14.325	Intel i7-12800H
openchat-3.5-ct2-int8	151	9.696	15.574	Intel i7-8700
openchat-3.5-ct2-int8	151	9.667	15.620	Intel i7-1260P
openchat-3.5-ct2-int8	151	20.794	7.262	Intel i9-11900H
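
The throughput column is simply tokens divided by time. As a minimal sketch, the snippet below recomputes it for two rows copied from the table; the `throughput` helper is a hypothetical name used only for illustration, and small differences from the table are expected because the reported times are rounded.

```python
# Two rows copied from the benchmark table: (model, tokens, seconds, device).
ROWS = [
    ("zephyr-7b-beta-ct2-int8", 219, 2.272, "NVIDIA RTX 3090"),
    ("openchat-3.5-ct2-int8", 151, 9.696, "Intel i7-8700"),
]


def throughput(tokens: int, seconds: float) -> float:
    """Return the number of tokens generated per second (hypothetical helper)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


for model, tokens, seconds, device in ROWS:
    print(f"{model} on {device}: {throughput(tokens, seconds):.1f} t/s")
```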

Setup

To set up the application, populate your .env file. You can do this with the following command.

Important

OMP_NUM_THREADS should correspond to the number of physical cores available.

{
  echo BACKEND_URL=localhost
  echo BACKEND_PORT=443
  echo CT2_USE_EXPERIMENTAL_PACKED_GEMM=1
  echo OMP_NUM_THREADS=8
} > .env
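
If you are unsure how many physical cores your machine has, the following sketch may help you choose a value for OMP_NUM_THREADS. It is not part of Examplify: the function names are hypothetical, and it assumes a Linux host whose /proc/cpuinfo reports `physical id` and `core id` fields. On other systems it falls back to the logical CPU count, which may overestimate the number of physical cores.

```python
import os


def count_physical_cores(cpuinfo_text: str) -> int:
    """Count unique (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "physical id":
            physical_id = value
        elif key == "core id":
            # Hyperthreads share the same (physical id, core id) pair.
            cores.add((physical_id, value))
    return len(cores)


def suggested_omp_num_threads(path: str = "/proc/cpuinfo") -> int:
    """Suggest a thread count, falling back to the logical CPU count."""
    try:
        with open(path, encoding="utf-8") as file:
            count = count_physical_cores(file.read())
    except OSError:
        count = 0
    # Assumption: if core information is missing, logical CPUs are the best guess.
    return count or os.cpu_count() or 1


if __name__ == "__main__":
    print(f"OMP_NUM_THREADS={suggested_omp_num_threads()}")
```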

Usage

Start the application with the command below, then access the Swagger UI at https://localhost/api/docs.

Warning

Before offline usage, you must run the application at least once with internet access to install any necessary dependencies.

make u
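
If you change the values in your .env file, the address of the Swagger UI may change with them. The sketch below derives the documentation URL from .env-style text. It assumes that BACKEND_URL and BACKEND_PORT determine where the application is served, which the original setup suggests but does not state; `parse_env` and `docs_url` are hypothetical helpers, not part of Examplify.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def docs_url(env: dict[str, str]) -> str:
    """Build the Swagger UI URL (assumes HTTPS and the /api/docs path)."""
    host = env.get("BACKEND_URL", "localhost")
    port = env.get("BACKEND_PORT", "443")
    # Port 443 is the HTTPS default, so it is omitted from the URL.
    authority = host if port == "443" else f"{host}:{port}"
    return f"https://{authority}/api/docs"


if __name__ == "__main__":
    example = "BACKEND_URL=localhost\nBACKEND_PORT=443\nOMP_NUM_THREADS=8\n"
    print(docs_url(parse_env(example)))
```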

Development

Install all dependencies with the following.

poetry install

Delete cached models with the following.

sudo make clean


