E2E-AI-Chatbot 🤖

Overview

E2E-AI-Chatbot is an end-to-end project that aims to build a chatbot able to answer questions in any domain. It is built on a Q&A chatbot workflow that uses Gradio and a GPT4All model. The project is currently in development.

Gradio is a Python library for quickly building customizable UI components around machine learning models, deep learning models, and other functions. Components can be mixed and matched to support any combination of inputs and outputs, with built-in support for NLP, images, plotting, and more.
GPT4All is an ecosystem for training and deploying powerful, customized large language models that run locally on consumer-grade CPUs. Note that your CPU needs to support AVX or AVX2 instructions. Its goal is to be the best instruction-tuned, assistant-style language model that any person or enterprise can freely use, distribute, and build on. A GPT4All model is a 3 GB to 8 GB file that you can download and plug into the GPT4All open-source ecosystem software. Nomic AI supports and maintains this software ecosystem to enforce quality and security, and leads the effort to let any person or enterprise easily train and deploy their own on-edge large language models.
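
Since GPT4All needs AVX or AVX2, it can help to check the CPU before downloading a model. The following is a minimal sketch, not part of this repository: it assumes a Linux host where `/proc/cpuinfo` is available, and the function names `parse_cpu_flags` and `supports_avx` are hypothetical.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 Linux kernels list CPU features on lines whose key is "flags".
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx(cpuinfo_text: str) -> bool:
    """Return True if the text reports AVX or AVX2 support."""
    return bool(parse_cpu_flags(cpuinfo_text) & {"avx", "avx2"})


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only; absent on macOS and Windows
    if cpuinfo.exists():
        print("AVX/AVX2 supported:", supports_avx(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; check the CPU flags another way.")
```

On other operating systems the same check has to be done with platform-specific tools.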

Tech Stack

#	Name	Description
1	Python	Programming language used to build the project
2	FastAPI	Backend web framework
3	Gradio	Low-code UI
4	Docker	Application packaging
5	GPT4All	Offline GPT model
6	Redis	Chat history cache
7	MongoDB	Document storage database
8	Mongo Express	Database UI
9	Logstash	Data migration
10	Elasticsearch	Search engine
11	Kibana	Search monitoring
12	Nginx	HTTP and reverse proxy server
13	Kubernetes	Container orchestration
14	AWS/Azure	Cloud
15	Pre-commit	Linting

Pipeline

Current:

Next stage:

Installation Requirements

Minimum: CPU with 8 GiB RAM
Uncomment line 8 of pyproject.toml, packages = [{include = "**"}], to use all internal packages (this passes Flake8)
Install the packages and download the GPT4All model using one of the methods under Run locally

Run locally

With poetry:

chmod u+x ./setup.sh
bash ./setup.sh

With pip:

pip install poetry
poetry shell
poetry install

Start MongoDB, Mongo Express, Logstash, Elasticsearch, Kibana and Redis, then run the app:

docker compose -f docker-compose-service.yml up
poetry run python app.py --host 0.0.0.0 --port 8071

Run docker

docker compose up

User Interface App

make run

Run on: http://localhost:8071

Login:

Account: admin
Password: admin

Chatbot:

New version:

Ingest PDF:


Model

GPT4All: the current best commercially licensable model, based on GPT-J and trained by Nomic AI on the latest curated GPT4All dataset.

Database

MongoDB runs on: mongodb://localhost:27017

poetry run python src/ingest_database.py --mongodb-host "mongodb://localhost:27017/" --data-path "static/pdf/"
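
To illustrate how the two flags in the command above could be consumed, here is a minimal sketch. It is not the actual content of `src/ingest_database.py`: the default values, the `build_parser` and `find_pdfs` helpers, and the assumption that PDFs sit directly inside the data directory are all hypothetical.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser exposing the same two flags as the ingest command."""
    parser = argparse.ArgumentParser(description="Ingest PDF files into MongoDB.")
    # Defaults mirror the values shown in the README command (an assumption).
    parser.add_argument("--mongodb-host", default="mongodb://localhost:27017/",
                        help="MongoDB connection URI")
    parser.add_argument("--data-path", default="static/pdf/",
                        help="Directory containing the PDF files")
    return parser


def find_pdfs(data_path: str) -> list[Path]:
    """Return the PDF files directly inside data_path, sorted by name."""
    return sorted(Path(data_path).glob("*.pdf"))


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--data-path", "static/pdf/"])
print(args.mongodb_host, [p.name for p in find_pdfs(args.data_path)])
```

argparse turns the dashes in `--mongodb-host` and `--data-path` into underscores, so the values are read as `args.mongodb_host` and `args.data_path`.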

Mongo Compass (Windows)

Mongo Express

Run on: http://localhost:8081

Data Migration

Logstash runs on: http://localhost:9600

Search

Elasticsearch & Kibana

poetry run python src/ingest_search.py --mongodb-host "mongodb://localhost:27017/" --es-host "http://localhost:9200/" --index_name "document"

Elasticsearch runs on: http://localhost:9200

Kibana runs on: http://localhost:5601
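
With several services running side by side, a quick way to see which ones are up is to probe the ports listed in this README. The sketch below uses only the standard library; the `SERVICES` mapping and the helper names are hypothetical, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Host ports taken from this README; the labels are descriptive only.
SERVICES = {
    "App UI": 8071,
    "MongoDB": 27017,
    "Mongo Express": 8081,
    "Logstash": 9600,
    "Elasticsearch": 9200,
    "Kibana": 5601,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Probe every port in SERVICES and report which ones accept connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name:<14} {'up' if up else 'down'}")
```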

Cache Chat History

Because generating a response takes a long time (about 97 s per question), we use Redis to cache chat history.

With the cache, a response takes about 0.5 s per question.
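
The caching idea above can be sketched as a cache-aside lookup. This is an illustration under stated assumptions rather than the project's code: the key scheme, the one-hour expiry, and the names `cache_key` and `answer_with_cache` are hypothetical. The `store` argument only needs `get(name)` and `set(name, value, ex=...)`, both of which `redis.Redis` from redis-py provides.

```python
import hashlib
from typing import Callable


def cache_key(question: str) -> str:
    """Build a cache key from the question text."""
    # Normalise case and whitespace so trivial variations share one entry.
    normalised = " ".join(question.lower().split())
    return "chat:" + hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def answer_with_cache(store, question: str,
                      generate: Callable[[str], str],
                      ttl_seconds: int = 3600) -> str:
    """Return a cached answer if present, otherwise generate and store it."""
    key = cache_key(question)
    cached = store.get(key)
    if cached is not None:
        # redis-py returns bytes unless decode_responses=True is set.
        return cached.decode("utf-8") if isinstance(cached, bytes) else cached
    answer = generate(question)  # slow path: call the model
    store.set(key, answer, ex=ttl_seconds)  # expire after ttl_seconds
    return answer
```

With redis-py installed, `store` could be `redis.Redis(host="localhost", port=6379)`, assuming Redis is exposed on its default port, and `generate` would wrap the call to the GPT4All model.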

Contact

KhoiVN - @linkedin-khoivn8071 - nguyenkhoi8071@gmail.com
Project Link: Github-E2E-AI-Chatbot
Website: khoivn.space

Acknowledgments

LangChain framework: https://github.com/hwchase17/langchain
GPT4All: https://github.com/nomic-ai/gpt4all

