Description

A web server that retrieves documents for ChatGPT. The service is built with NestJS and the OpenAI API. The vector database is powered by Pinecone.

Usage

Environment Variables

The OpenAI and Pinecone services are required to run the application. You can sign up for a free account with each service to get the API keys.

Environment variables are required to run the application. Set them in the OS environment or create a .env file in the root directory of the project.

$ export OPENAI_API_KEY=sk-xxx
$ export PINECONE_API_KEY=xxx

All environment variables are listed in the .env file below:

# OpenAI api
## check https://platform.openai.com/account/api-keys to get your api key
OPENAI_API_KEY=sk-xxx

# Pinecone api
## check https://www.pinecone.io/docs/quickstart/ for more information
PINECONE_API_KEY=xxx
PINECONE_ENVIRONMENT=us-west4-gcp
PINECONE_INDEX=your-index-name

# Optional
# OPENAI_API_BASE_URL=your-openai-api-proxy-url
# OPENAI_API_DEFAULT_MODEL=gpt-3.5-turbo # default is "gpt-3.5-turbo"
# PIENCONE_NAMESPACE=your-pinecone-namespace # default is "" 
# PORT=3000  # default port is 3000
# JWT_SECRET=your-jwt-secret # If set, the api will be protected by jwt
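
The variables above could be read into a typed config object along the following lines. This is only a sketch: the `AppConfig` shape, `loadConfig`, and `requireVar` are hypothetical names, not taken from the project source, and it assumes that every variable listed above the "# Optional" marker is mandatory.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface AppConfig {
  openaiApiKey: string;
  pineconeApiKey: string;
  pineconeEnvironment: string;
  pineconeIndex: string;
  openaiBaseUrl?: string;
  openaiDefaultModel: string;
  pineconeNamespace: string;
  port: number;
  jwtSecret?: string;
}

type Env = Record<string, string | undefined>;

// Throws when a mandatory variable is missing or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Applies the defaults documented in the .env listing above.
function loadConfig(env: Env): AppConfig {
  return {
    openaiApiKey: requireVar(env, "OPENAI_API_KEY"),
    pineconeApiKey: requireVar(env, "PINECONE_API_KEY"),
    pineconeEnvironment: requireVar(env, "PINECONE_ENVIRONMENT"),
    pineconeIndex: requireVar(env, "PINECONE_INDEX"),
    openaiBaseUrl: env["OPENAI_API_BASE_URL"] || undefined,
    openaiDefaultModel: env["OPENAI_API_DEFAULT_MODEL"] ?? "gpt-3.5-turbo",
    // Variable name spelled exactly as in the listing above.
    pineconeNamespace: env["PIENCONE_NAMESPACE"] ?? "",
    port: Number(env["PORT"] ?? 3000),
    jwtSecret: env["JWT_SECRET"] || undefined,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`.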

Deployment

Run with Node.js

$ npm install
$ npm run build
$ npm run start:prod

Run with Docker

$ docker pull ocherry341/chat-meta:latest
# or build image from source
# $ docker build -t nest .
$ docker run -p 3000:3000 --env-file .env ocherry341/chat-meta

API Endpoints

`POST /insert`

Chunk a document and insert the chunks into the vector database.

Request

Field	Type	Required	Description
text	string	Required	The text of the document
[metadata]	string | number | boolean | string[]	Optional	Any other metadata to store with the document; any key is accepted

Example

{
    "text": "Some text to store in the database",
    "user": "User1",
    "title": "Document Title",
    "tags": ["tag1", "tag2"]
}

Response

Field	Type	Description
upsertedCount	number	The number of document chunks inserted or updated
failedCount	number	The number of document chunks that failed to insert or update
failedVectors	Array	The failed vectors

Example

{
    "upsertedCount": 1,
    "failedCount": 0,
    "failedVectors": []
}
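
A client call for this endpoint might look like the sketch below. `buildInsertBody`, `insertDocument`, and the `baseUrl` parameter are hypothetical, and sending the body as `application/json` is an assumption based on the JSON example above. Metadata keys are placed at the top level next to `text`, as in that example.

```typescript
type MetadataValue = string | number | boolean | string[];

interface InsertResponse {
  upsertedCount: number;
  failedCount: number;
  failedVectors: unknown[];
}

// Serializes the request; `text` is written last so a metadata key
// named "text" cannot overwrite the document text.
function buildInsertBody(
  text: string,
  metadata: Record<string, MetadataValue> = {},
): string {
  return JSON.stringify({ ...metadata, text });
}

// Assumes the server accepts a JSON body (not confirmed by the source).
async function insertDocument(
  baseUrl: string,
  text: string,
  metadata: Record<string, MetadataValue> = {},
): Promise<InsertResponse> {
  const res = await fetch(`${baseUrl}/insert`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildInsertBody(text, metadata),
  });
  if (!res.ok) {
    throw new Error(`Insert failed with status ${res.status}`);
  }
  return (await res.json()) as InsertResponse;
}
```

A caller would probably want to check `failedCount` in the response and retry or log the `failedVectors` when it is greater than zero.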

`POST /insert-file`

Upload a file, chunk it, and insert the chunks into the vector database. The following formats are supported:

text/plain (.txt)
application/pdf (.pdf)
application/vnd.openxmlformats-officedocument.wordprocessingml.document (.docx)

Request

Field	Type	Required	Description
file	file	Required	The uploaded file
[metadata]	string | number | boolean	Optional	Any other metadata to store with the document; any key is accepted

Response

Field	Type	Description
upsertedCount	number	The number of document chunks inserted or updated
failedCount	number	The number of document chunks that failed to insert or update
failedVectors	Array	The failed vectors
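
An upload could be sketched as a multipart form, as below. The helper names and `baseUrl` are hypothetical, and sending metadata as additional form fields is an assumption; only the `file` field name and the three MIME types come from the description above.

```typescript
// MIME types listed as supported for this endpoint.
const SUPPORTED_TYPES = [
  "text/plain",
  "application/pdf",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
];

// Builds the multipart form; rejects unsupported types before upload.
function buildFileForm(
  file: Blob,
  filename: string,
  metadata: Record<string, string | number | boolean> = {},
): FormData {
  if (!SUPPORTED_TYPES.includes(file.type)) {
    throw new Error(`Unsupported file type: ${file.type || "(unknown)"}`);
  }
  const form = new FormData();
  form.append("file", file, filename);
  // Assumption: metadata travels as plain form fields.
  for (const [key, value] of Object.entries(metadata)) {
    form.append(key, String(value));
  }
  return form;
}

async function insertFile(
  baseUrl: string,
  form: FormData,
): Promise<unknown> {
  // No Content-Type header here: fetch sets the multipart boundary itself.
  const res = await fetch(`${baseUrl}/insert-file`, {
    method: "POST",
    body: form,
  });
  if (!res.ok) {
    throw new Error(`Upload failed with status ${res.status}`);
  }
  return res.json();
}
```

Form fields are always strings on the wire, so numeric or boolean metadata may need to be converted back on the server side.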

`POST /query`

Query the vector database and return the most similar documents

Request

Field	Type	Required	Description
text	string	Required	The query text
topK	number	Optional	The number of results to return
filter	object	Optional	Metadata filter

Response

{
    "results": [],
    "matches": [
        {
            "id": "MXNKIvY3irlofJ7XqDIML",
            "score": 0.867960095,
            "values": [],
            "metadata": {
                "text": "A text chunk from the document, to be used as ChatGPT context",
                // any other metadata
                "author": "Robert",
                "source": "Technology Handbook",
                "tags": "[\"java\", \"python\", \"javascript\"]"
            }
        },
        ...
    ],
    "namespace": "your-namespace"
}
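
The response shape above can be consumed as in the following sketch. The types cover only the fields shown in the example, `contextFromMatches` and `parseTags` are hypothetical helpers, and the `minScore` threshold is an illustrative addition rather than a server feature. Because `tags` appears in the example as a JSON-encoded string, the sketch decodes it defensively.

```typescript
interface QueryMatch {
  id: string;
  score: number;
  metadata: { text: string; [key: string]: unknown };
}

interface QueryResponse {
  matches: QueryMatch[];
  namespace: string;
}

// Returns chunk texts, best match first, dropping low-scoring ones.
function contextFromMatches(
  response: QueryResponse,
  minScore = 0,
): string[] {
  return response.matches
    .filter((m) => m.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((m) => m.metadata.text);
}

// Decodes a JSON-encoded tags string; returns [] when absent or malformed.
function parseTags(match: QueryMatch): string[] {
  const raw = match.metadata["tags"];
  if (typeof raw !== "string") return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed.map(String) : [];
  } catch {
    return [];
  }
}
```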

`POST /remove`

Request

Field	Type	Required	Description
namespace	string	Optional	Namespace to remove from; defaults to the value of env PIENCONE_NAMESPACE
filter	object	Optional	Metadata filter
deleteAll	boolean	Optional	If true, delete all documents in the namespace

Note: Exactly one of filter and deleteAll must be provided.

Response

An empty object on success.
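
The constraint on filter and deleteAll can be checked on the client before sending the request. The sketch below reads the note as "exactly one of the two" and treats `deleteAll: false` as not provided; both readings are assumptions, and `validateRemoveRequest` is a hypothetical helper.

```typescript
interface RemoveRequest {
  namespace?: string;
  filter?: Record<string, unknown>;
  deleteAll?: boolean;
}

// Throws unless exactly one of `filter` / `deleteAll: true` is present.
function validateRemoveRequest(req: RemoveRequest): void {
  const hasFilter = req.filter !== undefined;
  const wantsDeleteAll = req.deleteAll === true;
  if (hasFilter === wantsDeleteAll) {
    throw new Error("Provide exactly one of `filter` or `deleteAll`.");
  }
}
```

Validating early avoids sending a request that would wipe a namespace by accident when a filter was also intended.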

`POST /chat`

Request

Field	Type	Required	Description
messages	Array<Message>	Required	OpenAI chat completion messages
filter	object	Optional	Metadata filter. If set, only the filtered documents will be used as context.
model	string	Optional	Model used to generate text; defaults to the value of env OPENAI_API_DEFAULT_MODEL. See list of models

// Message interface
interface Message {
    content: string;
    role: 'system' | 'user' | 'assistant';
    name?: string;
}

Response

A Server-Sent Events stream. Each event has only a data field, which contains a string of the generated text. The stream finishes with data: [DONE]
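
A client has to split the stream into events and stop at the [DONE] marker. The following is a minimal parsing sketch, assuming standard Server-Sent Events framing (events separated by a blank line, one optional space after `data:`); `parseSseChunk` is a hypothetical helper, not part of this service.

```typescript
interface SseParseResult {
  texts: string[]; // data payloads of complete events, in order
  done: boolean; // true once the [DONE] marker was seen
  rest: string; // trailing incomplete event, to prepend to the next chunk
}

function parseSseChunk(buffer: string): SseParseResult {
  const texts: string[] = [];
  let done = false;
  // Events are separated by a blank line; the last piece may be partial.
  const events = buffer.split(/\r?\n\r?\n/);
  const rest = events.pop() ?? "";
  for (const event of events) {
    const dataLines: string[] = [];
    for (const line of event.split(/\r?\n/)) {
      if (line.startsWith("data:")) {
        // Strip the field name and at most one leading space.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length === 0) continue;
    const data = dataLines.join("\n");
    if (data === "[DONE]") {
      done = true;
      break;
    }
    texts.push(data);
  }
  return { texts, done, rest };
}
```

When reading from a network stream, each decoded chunk would be appended to the previous `rest` before calling the parser again, until `done` is true.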

Authentication

If JWT_SECRET is set, the API is protected by JWT. To access the API, you need to provide a JWT in the Authorization header.

Otherwise, the API is open to the public.

This service does not implement a way to generate JWTs. You need to use another service to generate a JWT signed with the same secret.
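
For local testing, a token can be produced with Node's built-in crypto module, as sketched below. The HS256 algorithm, the `Bearer` scheme in the Authorization header, and the payload claims are all assumptions, since the source does not state which the server expects; `signJwt` and `authHeaders` are hypothetical helpers. In production, an established JWT library is the safer choice.

```typescript
import { createHmac } from "node:crypto";

function base64url(input: string): string {
  return Buffer.from(input, "utf8").toString("base64url");
}

// Signs a payload as a compact JWT using HMAC-SHA256 (assumed algorithm).
function signJwt(payload: Record<string, unknown>, secret: string): string {
  const header = base64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = base64url(JSON.stringify(payload));
  const signature = createHmac("sha256", secret)
    .update(`${header}.${body}`)
    .digest("base64url");
  return `${header}.${body}.${signature}`;
}

// Assumption: the server reads the token with the Bearer scheme.
function authHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}
```

The secret passed to `signJwt` must match the JWT_SECRET value the server was started with.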

Development

Installing Dependencies

$ npm install

Running the app

# development
$ npm run start

# watch mode
$ npm run start:dev

# production mode
$ npm run start:prod

Test

# unit tests
$ npm run test

# e2e tests
## Note:
## Before running e2e tests, compile the worker_thread .ts file
## at src/modules/document/document-worker to .js by building the project into ./dist
# $ npm run build
$ npm run test:e2e

# test coverage
$ npm run test:cov
