chat-search

Chat with documents, search via natural language.

chat-search supports hybrid language models to add chat capabilities to website. RAG built with LangChain, Redis, various model providers (OpenAI, Ollama, vLLM, Huggingface).

Demo: Chat about my blog

Usage

Setup .env

cp .env.example .env

Populate .env file with the required environment variables.

Name	Value	Default
AUTH_TOKEN	auto token used for ingest
CHAT_PROVIDER	model provider, `openai` or `ollama`	`openai`
DEBUG	enable DEBUG, `1` or `0`	`0`
DIGEST_PREFIX	prefix for digest in Redis	`digest`
ENABLE_FEEDBACK_ENDPOINT	enable feedback endpoint, `1` or `0`	`1`
ENABLE_PUBLIC_TRACE_LINK_ENDPOINT	enable public trace link endpoint, `1` or `0`	`1`
EMBEDDING_DIM	embedding dimensions	`1536`
EMBEDDING_PROVIDER	embedding provider, `openai` or `ollama` or `huggingface`	`openai`
HEADERS_TO_SPLIT_ON	html headers to split text	`h1,h2,h3`
HF_HUB_EMBEDDING_MODEL	huggingface hub embedding model or Text Embeddings Inference url	`http://localhost:8080`
INDEX_NAME	index name	`document`
INDEX_SCHEMA_PATH	index schema path	(will use `app/schema.yaml`)
MERGE_SYSTEM_PROMPT	merge system prompt with user input, for models not support system role, `1` or `0`	`0`
LANGCHAIN_API_KEY	langchain api key for langsmith
LANGCHAIN_ENDPOINT	langchain endpoint for langsmith	`https://api.smith.langchain.com`
LANGCHAIN_PROJECT	langchain project for langsmith	`default`
LANGCHAIN_TRACING_V2	enable langchain tracing v2	`true`
LLM_TEMPERATURE	temperature for LLM	`0`
OLLAMA_CHAT_MODEL	ollama chat model	`gemma`
OLLAMA_EMBEDDING_MODEL	ollama embedding model	`nomic-embed-text`
OLLAMA_URL	ollama url	`http://localhost:11434`
OPENAI_API_BASE	openai compatible api base url
OPENAI_API_KEY	openai api key	`EMPTY`
OPENAI_CHAT_MODEL	openai chat model	`gpt-3.5-turbo`
OPENAI_EMBEDDING_MODEL	openai embedding model	`text-embedding-3-small`
OTEL_SDK_DISABLED	disable OpenTelemetry, `false` or `true`	`false`
OTEL_SERVICE_NAME	OpenTelemetry service name, also used for Pyroscope application name	`chat-search`
PYROSCOPE_BASIC_AUTH_PASSWORD	Pyroscope basic auth password
PYROSCOPE_BASIC_AUTH_USERNAME	Pyroscope basic auth username
PYROSCOPE_SERVER_ADDRESS	Pyroscope server address	`http://localhost:4040`
PYROSCOPE_ENABLED	Enable Pyroscope or not, `1` or `0`	`1`
REDIS_URL	redis url	`redis://localhost:6379/`
REPHRASE_PROMPT	prompt for rephrase	check config.py
RETRIEVAL_QA_CHAT_SYSTEM_PROMPT	prompt for retrieval	check config.py
RETRIEVER_SEARCH_KWARGS	search kwargs for redis retriever as json	check config.py
RETRIEVER_SEARCH_TYPE	search type for redis retriever	`mmr`
TEXT_SPLIT_CHUNK_OVERLAP	chunk overlap for text split	`200`
TEXT_SPLIT_CHUNK_SIZE	chunk size for text split	`4000`
VERBOSE	enable verbose, `1` or `0`	`0`

Start Ollama (Optional)

Follow Ollama instructions

ollama serve
ollama pull gemma
ollama pull nomic-embed-text

Run on host

Install dependencies

pip install poetry==1.7.1
poetry shell
poetry install

Start dependencies

Start redis

docker compose -f compose.redis.yaml up

Launch LangServe

langchain serve

Visit http://localhost:8000/

Run in Docker

There is a compose.yml file for running the app and all dependencies in containers. Suitable for local end to end testing.

docker compose up --build

Visit http://localhost:8000/

Run in Kubernetes

There is a helm chart for deploying the app in Kubernetes.

Config Helm values

Using Helm

cp values.example.yaml values.yaml

Then update values.yaml accordingly.

Add helm repos:

helm repo add chat-search https://hemslo.github.io/chat-search/
helm repo add redis-stack https://redis-stack.github.io/helm-redis-stack/
helm repo add ollama-helm https://otwld.github.io/ollama-helm/

Install/Upgrade chat-search

helm upgrade -i --wait my-chat-search chat-search/chat-search -f values.yaml

Using Skaffold for local development

skaffold run --port-forward

Ingest data

crawl --sitemap-url $SITEMAP_URL --auth-token $AUTH_TOKEN

Check crawl.yml for web crawling,

Example auto ingest after Github Pages deploy, jekyll.yml.

Architecture

Ingest

flowchart LR
  A(Crawl) --> |doc| B(/ingest)
  B --> |metadata| C(Redis)
  B --> |doc| D(Text Splitter)
  D --> |docs| E(Embedding Model)
  E --> |docs with embeddings| F(Redis)

Query

flowchart LR
  A((Request)) --> |messages| B(/chat)
  B --> |messages| C(LLM)
  C --> |question| D(Embedding Model)
  D --> |embeddings| E(Redis)
  E --> |relevant docs| F(LLM)
  B --> |messages|F
  F --> |answer| G((Response))

Deployment

Check cicd.yml for Google Cloud Run deployment, deploy-to-cloud-run.

Name		Name	Last commit message	Last commit date
Latest commit History 69 Commits
.github/workflows		.github/workflows
apm		apm
app		app
charts/chat-search		charts/chat-search
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
compose.apm.yaml		compose.apm.yaml
compose.redis.yaml		compose.redis.yaml
compose.yaml		compose.yaml
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml
service.template.yaml		service.template.yaml
skaffold.yaml		skaffold.yaml
values.example.yaml		values.example.yaml

License

hemslo/chat-search

Folders and files

Latest commit

History

Repository files navigation