ALP

Overview

ALP is an open-source, knowledge-based conversational AI system, crafted to produce responses that are rooted in relevant information from external sources. 📖💫

ALP allows you to build a large knowledge base that can be queried while interacting with a chatbot. Similarity based context constructions allows for better relevance of materials extracted from the database. The chatbot has unlimited conversational memory and the ability to export conversation and source embeddings to JSON format.

ALP maintains conversation history and embeddings in a local SQLite database 🗄️. As a result, document uploading and embedding processes are required only once, enabling users to resume conversations seamlessly.

ALP is intended to be run via localhost 💻. All you need is Python and few commands to setup environment. Feel free to fork, explore the code, and adjust to your needs 🔧.

Changelog

20231226 - gpt-4-1106-preview added as default generative model. User can change it in ./lib/params.py in PROD_MODEL. Collection creation page count bug fix.
20230911 - define custom agent behavior in ./static/data/simulacra.json and choose them in session manager from a drop-down menu.
20230406 - bug-fix that prevented program to initialize database under /static/data/dbs/; requirements.txt fix
20230705 - new data model, SBERT based embeddings
20230411 - interface upgrade, UX improvements, bugfixes
20230405 - program stores converastion history

Introduction

ALP enhances the accuracy of responses of GPT-based models relative to given PDF documents by using a retrieval augmentation method. This approach ensures that the most relevant context is always passed to the model. The intention behind ALP is to assist exploration of the overwhelming knowledge base of research papers, books and notes, making it easier to access and digest content.

Currently ALP utilizes following models:

Text embedding: multi-qa-MiniLM-L6-cos-v1
Generation: gpt-4-1106-preview', gpt-4-32k, gpt-4, gpt-3.5-turbo-16k, gpt-3.5-turbo.

Features

Conversational research assistant: Interact with and get information from collections of pdf files.
Unlimited conversational memory: Retain information from previous conversations for context-aware responses.
Come back to past conversations: Pick up your conversation right where you left.
Flexible data model: Allows for conversation with more than one document in one session.
Uses open source models: Sentence-Transformers allow for costless and fast text embedding.
Support for long documents: upload books and other lengthy pdfs.
Retrieval augmentation: Utilize retrieval augmentation techniques for improved accuracy. Read more here.
Define your own agent behavior: set system message defining your agents in ./static/data/simulacra.json. You can then easily adjust html form to choose names from a drop-down menu.
Local deployment: Spin up ALP locally on your machine.
JSON export: Export conversation as a JSON file.
Open source: The code is yours to read, test and break and build upon.

Installation

To set up ALP on your local machine, follow these steps:

Install Python:

Make sure you have Python installed on your machine. I recommend Anaconda for an easy setup.

Important: ALP runs on Python 3.10

Fork and clone the repository:

After forking the repo clone it in a command line:

git clone https://github.com/yourusername/alp.git 
cd ALP

Create a virtual environment and activate it:

From within the ALP/ local directory invoke following commands

For Linux users in Bash:

python3 -m venv venv
source venv/bin/activate

For Windows users in CMD:

python -m venv venv
venv\Scripts\activate.bat

This should create ALP/venv/ directory and activate virtual environment. Naturally, you can use other programs to handle virtualenvs.

Install the required dependencies to virtual environment:

pip install -r requirements.txt

Configuration

By default, ALP runs in localhost. It requires API key to connect with GPT model via Open AI API. in the ALP/ directory create an api_key.txt and paste your API key there. Make sure api_key.txt is added to your .gitignore file so it doesnt leak to github. You can get your Open AI API key here https://platform.openai.com 🌐

Usage

Run the ALP application:

python alp.py

Access the application:

The app should open in your default web browser. If it doesn't, navigate to http://localhost:5000. First use involves creation of app.db file under ALP/static/data/dbs/. This is your SQLite database file that will hold conversation history and embeddings. Also, the script will download 'multi-qa-MiniLM-L6-cos-v1' (80 MB) to your PC from Hugging Face repositories. It will happen automatically on a first launch.

Start using ALP:

ALP app interface consists of couple of sections:

HUB allows you to create new collection under 'Collections' or create/continue conversation session under 'Sessions'.
COLLECTIONS MANAGER: create a new collection by specifying its name and uploading a pdf file.
SESSION MANAGER:
- continue existing session by selecting it from dropdown and hitting 'Start'.
- create a new session there as well by defining its name and selecting collections linked to the session.
- Select predefined agent role for the conversation. Works also for historical chats.
SESSION window: talk to GPT model over collections linked to the session. Each time a similarity score is calculated between user's query and collections' embeddings, a list of most similar datasources is printed in the program's console. These sources will be passed as a context with the user's question to the GPT model.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lib

lib

static

static

templates

templates

.gitignore

.gitignore

README.md

README.md

alp.py

alp.py

requirements.txt

requirements.txt

Repository files navigation

ALP

Overview

Table of Contents

Changelog

Introduction

Features

Installation

Configuration

Usage

Screenshots

Todo

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 150 Commits
lib		lib
static		static
templates		templates
.gitignore		.gitignore
README.md		README.md
alp.py		alp.py
requirements.txt		requirements.txt

rpast/ALP

Folders and files

Latest commit

History

Repository files navigation

ALP

Overview

Table of Contents

Changelog

Introduction

Features

Installation

Configuration

Usage

Screenshots

Todo

About

Topics

Resources

Stars

Watchers

Forks

Languages