DataBase Question Answering (DBQA)

DBQA is an application that uses Retrieval-Augmented Generation (RAG) to answer questions about local documents.
It supports a wide range of document formats, including TXT, HTML, Word, PDF, and EPUB.

Dependency Installation

For Ubuntu

apt-get install python-dev-is-python3 libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext
pip install -r requirements.txt

For Arch Linux

pacman -S python libxml2 libxslt antiword unrtf poppler pstotext
pip install -r requirements.txt

Usage

Note: The local TAIDE model is expected to be located at /var/models/llama2-7b-chat-b5.0.0

Construct the vector database from local documents

python construct_vector_db.py /path/to/documents /path/to/output/directory

Ask a question about your documents

python dbqa.py /path/to/database/directory "Your question"

Development

Architecture

flowchart LR

subgraph "Retrieval-Augmented Generation (RAG)"
    direction LR
    *A(Start) --> AA[/Query/]
    AB[(Vector DB)]
    AC[/Prompt template/]
    AA ---> AD[Retriever]
    AB ---> AD
    AA ---> AE[Reranker]
    AD -- Documents --> AE
    AC --> AF[Formatter]
    AA ---> AF
    AE -- Related documents --> AF
    AF -- Prompt --> AG["Generator (LLM)"]
    AG --> AH[/Answer/]
end


subgraph Database Construction
    direction LR
    *B(Start) --> BA[/Raw documents/]
    BA --> BB[Text extractor]
    BB -- Documents --> BC[Splitter]
    BC -- Chunks --> BD[Embedding model] 
    BD -- Sentence\nEmbedding --> BE[(Vector DB)]
    BC -- Chunks --> BE 
end
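
The two flows in the diagram can be sketched in a few lines of plain Python. This is only an illustration of the data flow, not the code in this repository: the function names (split_text, embed, build_db, retrieve, format_prompt) are hypothetical, a bag-of-words counter stands in for the real embedding model, an in-memory list stands in for the vector database, and the reranker and the generator (LLM) are left out.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=40):
    """Splitter: cut a document into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_db(documents, chunk_size=40):
    """Database construction: split each document, store (chunk, embedding) pairs."""
    return [(chunk, embed(chunk))
            for doc in documents
            for chunk in split_text(doc, chunk_size)]


def retrieve(db, query, k=3):
    """Retriever: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(db, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def format_prompt(template, query, chunks):
    """Formatter: fill the prompt template with the query and related chunks."""
    return template.format(context="\n".join(chunks), question=query)
```

A prompt built with format_prompt("Context:\n{context}\n\nQuestion: {question}", query, retrieve(db, query)) would then be handed to the generator to produce the answer.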

Class Documentation

ChatTuple: Grouped chat record for rendering prompt
DocumentStore: Encapsulation of the vector database and the embedding model
TaideChatModel: LangChain-compatible ChatModel for TAIDE
TextractLoader: Universal file text extractor
ParallelSplitter: Splitter with multiprocessing
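
To illustrate the idea behind a splitter with multiprocessing, here is a minimal sketch that splits each document in a separate worker process. It is not the ParallelSplitter class from this repository: the functions split_text and split_documents, their parameters, and the character-based chunking with overlap are all assumptions made for the example.

```python
from functools import partial
from multiprocessing import Pool


def split_text(text, chunk_size=200, overlap=20):
    """Split one text into character chunks; consecutive chunks share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that no chunk lies entirely inside the previous overlap.
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]


def split_documents(texts, chunk_size=200, overlap=20, processes=None):
    """Split many texts, one per worker task, and return all chunks in input order."""
    worker = partial(split_text, chunk_size=chunk_size, overlap=overlap)
    if processes == 1:
        # Serial fallback: avoids process start-up cost for small inputs.
        per_document = [worker(text) for text in texts]
    else:
        # processes=None lets Pool use one worker per available CPU.
        with Pool(processes) as pool:
            per_document = pool.map(worker, texts)
    return [chunk for chunks in per_document for chunk in chunks]
```

When calling split_documents with more than one process from a script, place the call under an if __name__ == "__main__": guard so that worker processes can import the module safely on platforms that spawn rather than fork.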
