Foreword for users in China

Some users have reported problems when starting this project in China. OpenAI or Pinecone may be blocking access by region (as of 21.05.2023).

Solution from jchermy

  1. Create a Cloudflare Worker; see noobnooc/noobnooc#9.
  2. Proxy all OpenAI requests through it. For example:
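// Assumed import path (langchain releases from 2023); adjust to your version.
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';

// Send embedding requests to the proxy instead of api.openai.com.
const embeddings = new OpenAIEmbeddings(undefined, {
  basePath: 'https://openai.1rmb.tk/v1/',
  apiKey: 'xxx', // your OpenAI API key
});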
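
The Worker from step 1 is not shown in this guide. As a rough sketch only, a proxy of this kind could rewrite the host of each incoming request to api.openai.com and forward the request otherwise unchanged. The names `toUpstreamUrl` and `worker` are hypothetical, and the Worker described in noobnooc/noobnooc#9 may be built differently.

```typescript
// Hypothetical sketch of a Cloudflare Worker that forwards requests to OpenAI.
// Assumption: the upstream API host is api.openai.com; all names are illustrative.
const UPSTREAM_HOST = 'api.openai.com';

// Point the incoming URL at the upstream host,
// keeping the path and query string unchanged.
function toUpstreamUrl(incomingUrl: string): string {
  const url = new URL(incomingUrl);
  url.protocol = 'https:';
  url.hostname = UPSTREAM_HOST;
  url.port = ''; // drop any custom port used by the proxy
  return url.toString();
}

// In a Worker project this object would be the module's default export.
const worker = {
  async fetch(request: Request): Promise<Response> {
    // Reuse the method, headers, and body of the original request.
    return fetch(new Request(toUpstreamUrl(request.url), request));
  },
};

console.log(toUpstreamUrl('https://openai.1rmb.tk/v1/embeddings'));
// prints "https://api.openai.com/v1/embeddings"
```

With a proxy like this deployed, the "basePath" in the example above would be the Worker's own address followed by "/v1".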

Prerequisites

Discord Server

Join the Discord server if you have questions.

Project setup

Download the project

Download the project: https://github.com/mayooear/gpt4-pdf-chatbot-langchain

Install Node.js and npm

Install Node.js and npm. Official website: https://nodejs.org/en/download

Verify that both are installed with the following terminal commands.

  • Your versions may differ; the only requirement is that Node.js is version 18 or higher.

image
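
The screenshot with the commands is not reproduced here; `node -v` and `npm -v` are the usual way to print the installed versions. The version requirement itself can be sketched as a small check. The helper names `nodeMajor` and `meetsMinimum` are made up for this illustration and are not part of the project.

```typescript
// Extract the major version from a string such as "v18.16.0",
// which is the format printed by `node -v`.
function nodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version.trim());
  if (!match) {
    throw new Error(`Unrecognized Node.js version string: ${version}`);
  }
  return Number(match[1]);
}

// The guide requires Node.js 18 or higher.
function meetsMinimum(version: string, minimumMajor = 18): boolean {
  return nodeMajor(version) >= minimumMajor;
}

console.log(meetsMinimum('v18.16.0')); // true
console.log(meetsMinimum('v16.20.2')); // false
```

Inside a running Node.js process, the same check could be applied to `process.version`.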

Install all required packages

Open a terminal at the top level of the project and run "npm install" to install all necessary dependencies.

image

Create a ".env" file

Create a file named ".env" inside your project with the following content:

OPENAI_API_KEY=
PINECONE_API_KEY=
PINECONE_ENVIRONMENT=
PINECONE_INDEX_NAME=

image
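
A forgotten or empty entry in ".env" is an easy mistake to make, so a quick check can help before going further. The sketch below assumes the file content has already been read into a string; `parseEnv` and `missingKeys` are hypothetical helpers, and the parser is deliberately simplified (no quoting or escaping rules).

```typescript
// The four variables this guide asks for.
const REQUIRED_KEYS = [
  'OPENAI_API_KEY',
  'PINECONE_API_KEY',
  'PINECONE_ENVIRONMENT',
  'PINECONE_INDEX_NAME',
];

// Simplified KEY=VALUE parser: skips blank lines and comments, trims whitespace.
function parseEnv(text: string): Record<string, string> {
  const values: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const separator = line.indexOf('=');
    if (separator === -1) continue;
    values[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return values;
}

// List every required key that is absent or has an empty value.
function missingKeys(text: string): string[] {
  const values = parseEnv(text);
  return REQUIRED_KEYS.filter((key) => !values[key]);
}

console.log(missingKeys('OPENAI_API_KEY=sk-example\nPINECONE_API_KEY=\n'));
```

An empty list means all four variables have a value; it does not confirm that the values are valid.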

OpenAI API key

Go to https://platform.openai.com/overview and open "View API keys".

image

Click "+ Create new secret key" and copy the key into your ".env" file.

image

Pinecone API key

Go to https://app.pinecone.io/organizations/ and create an API key there as well. Add the API key and its environment (PINECONE_API_KEY and PINECONE_ENVIRONMENT) to the ".env" file.

image

Pinecone index

Create a new Pinecone index with the dimension set to 1536, the vector length of OpenAI's text-embedding-ada-002 embeddings.

image

It should look similar to this:

image
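
The dimension of the index has to equal the length of every vector stored in it. A minimal sketch of that check follows; `INDEX_DIMENSION` and `assertDimension` are hypothetical names that do not appear in the project.

```typescript
// Assumption: the index was created with dimension 1536, as in the step above.
const INDEX_DIMENSION = 1536;

// Throw a descriptive error if a vector cannot be stored in the index.
function assertDimension(vector: number[], expected = INDEX_DIMENSION): void {
  if (vector.length !== expected) {
    throw new Error(
      `Vector has ${vector.length} dimensions, but the index expects ${expected}`,
    );
  }
}

assertDimension(new Array<number>(1536).fill(0)); // passes silently
```

A mismatch usually means the index was created with a different dimension than the one produced by the embedding model in use.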

Add Pinecone index to the ".env" file

Add the details of your index to the ".env" file. At this point it should look like this:

image

(Optional) Set namespace in config/pinecone.ts

Give the namespace for your vectors a name in "config/pinecone.ts".

image

All ingested vectors are stored under this namespace in your Pinecone index. The name you define in "config/pinecone.ts" is used both to store your ingested data and to look it up when you ask the chatbot a question.

  • It is not possible to search data in multiple namespaces simultaneously.

image
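
To make the namespace behavior concrete, here is a toy in-memory model. It is not the Pinecone client; `NamespacedIndex` and its methods are invented for this illustration. Each vector lives in exactly one namespace, and a lookup only ever sees the namespace it names.

```typescript
// Toy in-memory model of a namespaced vector index (not the Pinecone client).
class NamespacedIndex {
  private namespaces = new Map<string, Map<string, number[]>>();

  // Insert a vector, or overwrite the one with the same id, in one namespace.
  upsert(namespace: string, id: string, values: number[]): void {
    const bucket = this.namespaces.get(namespace) ?? new Map<string, number[]>();
    bucket.set(id, values);
    this.namespaces.set(namespace, bucket);
  }

  // A lookup is scoped to a single namespace.
  idsIn(namespace: string): string[] {
    return Array.from(this.namespaces.get(namespace)?.keys() ?? []);
  }
}

const toyIndex = new NamespacedIndex();
toyIndex.upsert('manuals', 'chunk-1', [0.1, 0.2]);
toyIndex.upsert('contracts', 'chunk-2', [0.3, 0.4]);
console.log(toyIndex.idsIn('manuals')); // only "chunk-1"
```

In this model, searching several namespaces would mean running one lookup per namespace and merging the results yourself.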

Docs folder

Create a folder named "docs" and put your PDF files in it.

image

Ingest your PDF

Run "npm run ingest" in a terminal. This ingests your PDF into the Pinecone database. Afterwards, remove the PDF from the "docs" folder; otherwise it will be ingested a second time the next time you run "npm run ingest".

The terminal must be at the top level of the project folder. In this example, "gpt4-pdf-chatbot-langchain" is installed in "Documents".

image
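
Removing files by hand is easy to forget. One possible alternative, which the project itself does not provide, is to keep a record of the files that were already ingested and skip them. The function name `pendingPdfs` and the idea of a stored list are assumptions made for this sketch.

```typescript
// Return the PDF files in "docs" that have not been ingested yet.
// filesInDocs: file names found in the "docs" folder.
// alreadyIngested: names recorded after earlier ingest runs (hypothetical record).
function pendingPdfs(filesInDocs: string[], alreadyIngested: Set<string>): string[] {
  return filesInDocs
    .filter((name) => name.toLowerCase().endsWith('.pdf'))
    .filter((name) => !alreadyIngested.has(name));
}

const ingestedBefore = new Set(['report.pdf']);
console.log(pendingPdfs(['report.pdf', 'manual.PDF', 'notes.txt'], ingestedBefore));
// only "manual.PDF" is left to ingest
```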

Start application

Start the application with "npm run dev" in the terminal. Open a browser and go to http://localhost:3000/. You should see the project.

image

Debugging

Alternative ingest with Python

To troubleshoot, it may be useful to look at my Python reimplementation of the ingest step. It gives more control over each step of the ingest.

https://github.com/ucl98/pinecone_ingest_python_implementation

OpenAI paid account

Check whether you have a paid account. If you see the following message, you need to set up a paid account.

OpenAI free trial expired

You may have used a free trial of the OpenAI API. Once the trial expires, you can no longer use the API until you add a paid plan.

image

image

Issues with GPT-4 API key

Go to the OpenAI Playground and check whether you have access to GPT-4. If the model is not listed there, you do not have access to it and need to use GPT-3.5.

  • Access to the GPT-4 API does not depend on a ChatGPT subscription. You need to apply for it.
image
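
Besides the Playground, the same question can be asked of the API directly: the "list models" endpoint of the OpenAI API returns the models a key may use. The sketch below assumes only the `data[].id` fields of that response; `hasModel` and `keyHasModel` are hypothetical helper names.

```typescript
// Subset of the "list models" response that this check relies on.
type ModelList = { data: { id: string }[] };

// True if the list contains the given model id.
function hasModel(list: ModelList, modelId: string): boolean {
  return list.data.some((model) => model.id === modelId);
}

// Ask the OpenAI API which models the given key can use.
async function keyHasModel(apiKey: string, modelId = 'gpt-4'): Promise<boolean> {
  const response = await fetch('https://api.openai.com/v1/models', {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`OpenAI API returned HTTP ${response.status}`);
  }
  return hasModel((await response.json()) as ModelList, modelId);
}

console.log(hasModel({ data: [{ id: 'gpt-3.5-turbo' }] }, 'gpt-4')); // false
```

Calling `keyHasModel` with your own key needs network access. An HTTP error here points to a problem with the key itself rather than with GPT-4 access.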

Check if Pinecone is working

You can send a Pinecone API request from the Pinecone web console. Go to the "Query" tab, enter the namespace, confirm it, and then click "Query". If it returns a result, the connection to Pinecone is working.

image

image
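
When reading the result of such a query, it helps to separate "Pinecone answered" from "Pinecone found vectors". The sketch below assumes the query response carries a `matches` list and an optional `namespace` field; treat that shape as an assumption to verify against Pinecone's API reference. `describeResult` is a hypothetical helper.

```typescript
// Assumed subset of a Pinecone query response.
type QueryResponse = {
  matches?: { id: string; score?: number }[];
  namespace?: string;
};

// Summarize what a query response says about the connection and the data.
function describeResult(response: QueryResponse): string {
  const count = response.matches?.length ?? 0;
  if (count > 0) {
    return `Connection works: ${count} match(es) in namespace "${response.namespace ?? ''}"`;
  }
  return 'Connection works, but no vectors matched in this namespace';
}

console.log(describeResult({ matches: [{ id: 'chunk-1', score: 0.9 }], namespace: 'pdf-test' }));
```

An empty result may point to a wrong namespace name or to an ingest run that stored nothing, rather than to a connection problem.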

API keys

Another possible cause is that your API key is not working. Create a new one and test again.

image

Contributors

  • ucl
  • angelina-magidova-synder
  • chaudhary_181
  • jchermy