Architecture

Akcio is a chatbot system that utilizes a retrieval-augmented large language model (LLM) to provide users with more accurate and relevant information. The system is designed to retrieve related information from a knowledge base through semantic search and then generate an answer using a large language model or service such as GPT-3.5.

The architecture of Akcio ensures that the chatbot can provide users with a fluent conversation and more accurate answers. By retrieving information from a knowledge base, the factual accuracy of the LLM's responses is improved, making Akcio a reliable and effective system for information retrieval.

The entire architecture requires three key components:

  • LLM service: The LLM service is responsible for question-answering by generating answers in response to user queries.
  • Knowledge base: The knowledge base stores documents and can provide additional supportive information for LLM to generate more helpful responses. The data in the knowledge base should enable semantic search and/or keyword matching.
  • Memory: Memory stores chat history to facilitate the retrieval of contextual information during conversations.

It can be divided into two main phases:

  • Insert: The Insert phase in the Akcio architecture is responsible for building the knowledge base by archiving documents in advance.
  • Query: The Query phase in the Akcio architecture is responsible for processing user queries, retrieving related information from the knowledge base, and generating accurate and relevant responses.

Akcio provides two options for constructing the architecture, each with a different module design.

Option 1: Towhee

The option using Towhee simplifies the process of building a system by providing predefined pipelines. These built-in pipelines require less coding and make system building much easier. If you require customization, you can either modify the configuration or create your own pipeline from the rich set of Towhee Operators.

Pipelines

  • Insert: The insert pipeline builds a knowledge base by saving documents and corresponding data in database(s).
  • Search: The search pipeline enables the question-answering capability, powered by information retrieval (semantic search and optional keyword matching) and the LLM service (see the sketch after this list).
  • Prompt: A prompt operator prepares messages for the LLM by assembling the system message, chat history, and the user's query rendered through a template.
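
For illustration, a built-in pipeline might be loaded and run as below. This is a minimal sketch based on Towhee's AutoPipes interface; the pipeline names (osschat-insert, osschat-search), the call signatures, and the project name are assumptions, not verbatim Akcio code.

```python
# A minimal sketch using Towhee's AutoPipes/AutoConfig; pipeline names and
# argument orders are illustrative assumptions.
from towhee import AutoConfig, AutoPipes

# Insert: parse a source document, embed the chunks, and store them
# under a project name.
insert_config = AutoConfig.load_config('osschat-insert')   # assumed name
insert_pipe = AutoPipes.pipeline('osschat-insert', config=insert_config)
insert_pipe('./docs/example.md', 'akcio_demo')

# Search: retrieve related chunks, assemble the prompt, and call the LLM.
search_config = AutoConfig.load_config('osschat-search')   # assumed name
search_pipe = AutoPipes.pipeline('osschat-search', config=search_config)
answer = search_pipe('What is Akcio?', [], 'akcio_demo').get()[0]
```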

Memory

The memory store keeps chat history to support context in conversation.
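
As a rough illustration only (not Akcio's actual storage backend, which persists history in a database), a chat-history store reduces to per-session append and fetch operations:

```python
# Illustrative only: a per-session chat-history store reduced to its
# essential operations.
from collections import defaultdict

class ChatHistory:
    def __init__(self):
        self._sessions = defaultdict(list)

    def add(self, session_id: str, question: str, answer: str) -> None:
        # Append one question-answer round to the session's history.
        self._sessions[session_id].append((question, answer))

    def get(self, session_id: str) -> list:
        return self._sessions[session_id]

history = ChatHistory()
history.add('session-1', 'What is Akcio?', 'A retrieval-augmented chatbot.')
print(history.get('session-1'))
```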

Supported methods:

Option 2: LangChain

The option using LangChain employs an Agent to enable the LLM to utilize specific tools, which places greater demands on the LLM's ability to comprehend tasks and make informed decisions. It comprises five modules: Agent, LLM, Embedding, Store, and DataLoader.

Agent

An agent uses the LLM to determine actions and procedures, assembling modules into a workflow for a specific task. The workflow steps are defined through prompt templates, which accept tools and inputs as variables and return formatted outputs.

The default module ChatAgent is built on top of the LangChain Conversation Agent. It can construct a chat system with action options from given tools, such as document retrieval. It is also equipped with LLM and memory modules to provide greater flexibility.
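
As a rough sketch, a comparable conversational agent can be assembled with LangChain's agent API (circa 2023); the retrieval tool below is a hypothetical stand-in, not Akcio's actual ChatAgent:

```python
# A minimal sketch of a conversational agent with one retrieval tool.
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

def search_docs(query: str) -> str:
    """Hypothetical stand-in for knowledge-base retrieval."""
    return f'document chunks related to: {query}'

agent = initialize_agent(
    tools=[Tool(name='Search', func=search_docs,
                description='Retrieve related documents from the knowledge base.')],
    llm=ChatOpenAI(temperature=0),
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    memory=ConversationBufferMemory(memory_key='chat_history', return_messages=True),
)
agent.run('What is Akcio?')
```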

Here are some useful links:

LLM

The LLM module is responsible for generating responses based on the user's query and the context retrieved from stores. Besides providing AI responses to users, it can also help to analyze underlying user needs based on context or to anticipate potential user questions for a given document.

The chatbot normally uses ChatLLM as the default LLM module to generate responses as the final answer, which is returned to the user end. To work with the LangChain agent, this module must be a LangChain BaseChatModel.

By default, it uses ChatOpenAI from LangChain, which calls the chat service of OpenAI. Refer to LangChain Language Models for more LLM options.
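
As a minimal sketch, calling such a chat model directly looks like this; the system message and context below are illustrative:

```python
# A minimal sketch of calling a LangChain chat model directly (2023-era API).
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0)
messages = [
    SystemMessage(content='Answer the question using the given context.'),
    HumanMessage(content='Context: Akcio is a retrieval-augmented chatbot.\n'
                         'Question: What is Akcio?'),
]
print(llm(messages).content)
```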

Supported methods:

Here are some useful links:

  • Configuration: configure the LLM module, such as switching models, modifying temperature, etc.
  • Customization: write your own LLM module

Embedding

The Embedding module provides methods to convert unstructured data to embeddings, which are stored in the VectorStore. To work with the LangChain agent and the vector store, this module must follow the LangChain Embeddings interface.

The default module TextEncoder is implemented in VectorStore as a built-in embedding method.
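
To satisfy that interface, an embedding module only needs embed_documents and embed_query. A minimal sketch, assuming sentence-transformers as the underlying model (the class name here is hypothetical and the real TextEncoder may differ):

```python
# A minimal sketch of an Embeddings-compatible encoder, assuming
# sentence-transformers underneath; not Akcio's actual TextEncoder.
from typing import List

from langchain.embeddings.base import Embeddings
from sentence_transformers import SentenceTransformer

class SketchTextEncoder(Embeddings):
    def __init__(self, model_name: str = 'all-MiniLM-L6-v2', normalize: bool = True):
        self.model = SentenceTransformer(model_name)
        self.normalize = normalize

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Encode a batch of documents; normalization is optional.
        return self.model.encode(texts, normalize_embeddings=self.normalize).tolist()

    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]
```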

Supported methods:

Here are some useful links:

  • Configuration: configure the Embedding module, such as switching models or disabling normalization.
  • Customization: write your own Embedding module.

Store

The Store module archives data in databases and enables information retrieval. There are three stores in the Akcio system: vector store, scalar store, and memory store.

Visit Configuration to learn how to configure each store, such as switching models, disabling normalization, or setting up connections.

VectorStore

The VectorStore is responsible for embedding storage and semantic search in the workflow. It accepts either text or embedding inputs. For text inputs, it uses a predefined embedding method to convert the inputs into embeddings.
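
As a rough sketch, assuming Milvus as the backend via LangChain's Milvus wrapper (the embedding model, collection name, and connection details are illustrative):

```python
# A minimal sketch of inserting texts and running semantic search.
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Milvus

vector_store = Milvus(
    embedding_function=HuggingFaceEmbeddings(),
    collection_name='akcio_demo',
    connection_args={'host': '127.0.0.1', 'port': '19530'},
)
vector_store.add_texts(['Akcio is a retrieval-augmented chatbot.'])
docs = vector_store.similarity_search('What is Akcio?', k=3)
```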

Supported methods:

Here are some useful links:

ScalarStore

The Scalar Store is where the metadata of the documents is stored. It allows for efficient keyword-based search to provide additional information retrieval.
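
As a rough sketch, assuming Elasticsearch as the scalar backend (the index name and document shape are illustrative):

```python
# A minimal sketch of keyword matching over document metadata.
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')
es.index(index='akcio_meta', document={
    'doc_id': 'doc-1',
    'title': 'Akcio architecture',
    'keywords': ['akcio', 'architecture', 'retrieval'],
})
hits = es.search(index='akcio_meta', query={'match': {'keywords': 'retrieval'}})
for hit in hits['hits']['hits']:
    print(hit['_source']['title'])
```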

Supported methods:

Here are some useful links:

MemoryStore

The Memory Store is responsible for storing and recalling the chat history. It stores information such as user queries, chatbot responses, and session details. With this module, the system is able to support fluent multi-round conversations.
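
As a rough sketch, one possible backend is LangChain's SQL-backed chat history; the connection string and session id below are illustrative:

```python
# A minimal sketch of persisting and recalling chat history per session.
from langchain.memory.chat_message_histories import SQLChatMessageHistory

history = SQLChatMessageHistory(
    session_id='session-1',
    connection_string='sqlite:///chat_history.db',
)
history.add_user_message('What is Akcio?')
history.add_ai_message('Akcio is a retrieval-augmented chatbot.')
print(history.messages)  # recalled for the next round of the conversation
```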

Supported methods:

Here are some useful links:

DataLoader

The DataLoader prepares data to be stored. For basic usage, it only needs the DataParser to load data from a source and split it into smaller chunks. For advanced deployment, the system additionally extracts potential questions for each doc chunk using the QuestionGenerator.

Visit Configuration to learn about how to configure this module, such as changing chunk size.

DataParser

The DataParser parses documents from a given data source and splits them into a list of doc chunks. By default, it accepts files or URLs as data sources and uses the LangChain RecursiveCharacterTextSplitter to split documents.
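
A minimal sketch of that default chunking (chunk sizes are illustrative):

```python
# Split a long text into overlapping chunks with LangChain's splitter.
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=30)
text = ('Akcio retrieves related information from a knowledge base and '
        'generates answers with an LLM. ') * 20
chunks = splitter.split_text(text)
print(len(chunks), chunks[0][:60])
```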

Supported methods:

Here are some useful links:

  • Customization: write your own DataParser module, for example, using a different text splitter.

QuestionGenerator

The QuestionGenerator is used to build an advanced knowledge base. It splits a document into smaller chunks and then uses an LLM service, such as ChatGPT, to generate potential questions whose answers are available in the document. In this case, the system stores embeddings of the extracted questions instead of the original documents. In the query flow, the system searches for similar questions in the vector store and then returns the source documents of the retrieved questions.
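
As a rough sketch (the prompt wording and helper name are hypothetical, not Akcio's actual QuestionGenerator):

```python
# Generate candidate questions for a doc chunk with an LLM.
from typing import List

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = ChatOpenAI(temperature=0.2)

def generate_questions(chunk: str, n: int = 3) -> List[str]:
    prompt = (f'Write {n} questions that the following passage can answer, '
              f'one per line:\n\n{chunk}')
    reply = llm([HumanMessage(content=prompt)]).content
    return [line.strip() for line in reply.splitlines() if line.strip()]

# Store embeddings of these questions, mapped back to the source chunk;
# at query time, similar questions are retrieved and the chunk is returned.
questions = generate_questions('Akcio offers Towhee and LangChain options.')
```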

Here are some useful links: