Error: Failed to ingest your data #60

Closed
yaya6630170 opened this issue Mar 26, 2023 · 36 comments

Comments

@yaya6630170

I pasted the relevant parts of the output below, leaving out the PDF content. Please help, I just can't find the bug.

@yaya6630170
Author

error [Error: Network Error] {
config: {
transitional: {
silentJSONParsing: true,
forcedJSONParsing: true,
clarifyTimeoutError: false
},
adapter: [AsyncFunction: fetchAdapter],
transformRequest: [ [Function: transformRequest] ],
transformResponse: [ [Function: transformResponse] ],
timeout: 0,
xsrfCookieName: 'XSRF-TOKEN',
xsrfHeaderName: 'X-XSRF-TOKEN',
maxContentLength: -1,
maxBodyLength: -1,
validateStatus: [Function: validateStatus],
headers: {
Accept: 'application/json, text/plain, */*',
'Content-Type': 'application/json',
'User-Agent': 'OpenAI/NodeJS/3.2.1',
Authorization: 'Bearer sk-XXXXX' // the OpenAI API key goes here
},
method: 'post',

url: 'https://api.openai.com/v1/embeddings'
},
code: 'ERR_NETWORK',
request: Request {
[Symbol(realm)]: { settingsObject: [Object] },
[Symbol(state)]: {
method: 'POST',
localURLsOnly: false,
unsafeRequest: false,
body: [Object],
client: [Object],
reservedClient: null,
replacesClientId: '',
window: 'client',
keepalive: false,
serviceWorkers: 'all',
initiator: '',
destination: '',
priority: null,
origin: 'client',
policyContainer: 'client',
referrer: 'client',
referrerPolicy: '',
mode: 'cors',
useCORSPreflightFlag: false,
credentials: 'same-origin',
useCredentials: false,
cache: 'default',
redirect: 'follow',
integrity: '',
cryptoGraphicsNonceMetadata: '',
parserMetadata: '',
reloadNavigation: false,
historyNavigation: false,
userActivation: false,
taintedOrigin: false,
redirectCount: 0,
responseTainting: 'basic',
preventNoCacheCacheControlHeaderModification: false,
done: false,
timingAllowFailed: false,
headersList: [HeadersList],
urlList: [Array],
url: [URL]
},
[Symbol(signal)]: AbortSignal { aborted: false },
[Symbol(headers)]: HeadersList {
cookies: null,
[Symbol(headers map)]: [Map],
[Symbol(headers map sorted)]: null
}
},
response: undefined,
isAxiosError: true,
toJSON: [Function: toJSON]
}
d:\Documents\GitHub\gpt4-pdf-chatbot-langchain\scripts\ingest-data.ts:53
throw new Error('Failed to ingest your data');
^

[Error: Failed to ingest your data]

Node.js v18.14.2
 ELIFECYCLE  Command failed with exit code 1.

@text2sql

I've fixed it already. Thanks.

@louis-sanna-eki

@texttosql how did you fix it?

@text2sql

I spent hours (1) trying to understand where the problem is, (2) consulting ChatGPT :-), and (3) tweaking the code here and there. I also updated several packages. To be honest, I am not 100% sure what exactly helped. As a first step, I'd recommend installing VS Code and looking at the Problems tab (it shows which part of the code has an issue), and making sure all packages are installed as per Mayo's instructions. I think the most important thing is to make sure the Pinecone client is up to date. I hope this helps.

@text2sql

Sorry, another first step is to look at the console and check whether the text is actually being split and whether embedding works. If both work, then the issue is likely with the Pinecone client.
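
One way to make the console show which step breaks is to wrap each stage of the ingest script so that a failure carries the stage name. This is only a sketch: `runStage` and `ingest` are hypothetical names, and the stand-in steps below replace the real splitter, embeddings call, and Pinecone upsert.

```typescript
// Hypothetical helper: run one pipeline stage and label any failure with its name.
async function runStage<T>(stage: string, step: () => Promise<T>): Promise<T> {
  console.log(`starting: ${stage}`);
  try {
    const result = await step();
    console.log(`finished: ${stage}`);
    return result;
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    throw new Error(`stage "${stage}" failed: ${detail}`);
  }
}

// Stand-in steps only; the real script would call the text splitter,
// the embeddings API, and the vector store here.
async function ingest(text: string): Promise<number> {
  const chunks = await runStage('split', async () => text.split('\n\n'));
  const vectors = await runStage('embed', async () => chunks.map((c) => [c.length]));
  return runStage('upsert', async () => vectors.length);
}
```

If the last "finished" line in the console is for the split stage, the problem sits in the embedding request rather than in Pinecone.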

@text2sql

Another one I forgot to mention: make sure to change the GPT model if you don't have access to GPT-4 yet. I've changed mine to gpt-3.5-turbo.
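
A small sketch of how the model name could be made configurable instead of hardcoded. The `OPENAI_MODEL` variable and the `resolveModelName` helper are assumptions made for illustration; they are not part of the repository.

```typescript
// Fallback used when no model is requested explicitly.
const FALLBACK_MODEL = 'gpt-3.5-turbo';

// Hypothetical helper: read the model name from an environment-like map.
// OPENAI_MODEL is an assumed variable name, not one the project defines.
function resolveModelName(env: Record<string, string | undefined>): string {
  const requested = (env['OPENAI_MODEL'] ?? '').trim();
  return requested.length > 0 ? requested : FALLBACK_MODEL;
}
```

Called with `process.env`, this would return the fallback unless the variable is set, so an account without GPT-4 access works without editing code.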

@louis-sanna-eki

louis-sanna-eki commented Mar 26, 2023

@texttosql thx! I get a very strange "PineconeClient: Project name not set. Call init() first." error, even though the official docs don't require setting it.

https://www.npmjs.com/package/@pinecone-database/pinecone

Screenshot 2023-03-26 at 16 21 02

@text2sql

I thought we were initializing it somewhere. My best advice: ask ChatGPT about this particular error, and also look at the Problems tab in VS Code.

@text2sql

did pinecone-client install ok?

@louis-sanna-eki

louis-sanna-eki commented Mar 26, 2023

The error is known, but none of the fixes work for me (updating Node, pinning the lib to 0.0.10).

pinecone-io/pinecone-ts-client#12

EDIT: I managed to get a new error by adding the projectName directly on the object

Screenshot 2023-03-26 at 16 35 00

EDIT2: new error
Screenshot 2023-03-26 at 16 37 00

@louis-sanna-eki

louis-sanna-eki commented Mar 26, 2023

So I finally managed to make it work.

The Pinecone lib uses the projectName to build the URL, so you have to set it on the client object. The projectName can be found in the Pinecone web UI.

// Pinecone lib
Screenshot 2023-03-26 at 17 07 21
// Pinecone interface with name
Screenshot 2023-03-26 at 17 07 53
// Your code where you set the projectName
Screenshot 2023-03-26 at 17 07 32

I have no idea why it works for everyone else.
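
Since the screenshots are not reproduced here, the following sketch illustrates why a missing projectName would break requests. The host pattern and the names `PineconeTarget` and `buildIndexHost` are assumptions for illustration, not code taken from the Pinecone client, and the sample values are made up.

```typescript
// Assumed shape of the values that end up in the index URL.
interface PineconeTarget {
  indexName: string;
  projectName: string;
  environment: string;
}

// Hypothetical helper: compose the index host from its parts.
// The pattern below is an assumption about how the 0.0.x client builds it.
function buildIndexHost(target: PineconeTarget): string {
  if (target.projectName.trim() === '') {
    throw new Error('projectName is empty; copy it from the Pinecone web UI');
  }
  return `https://${target.indexName}-${target.projectName}.svc.${target.environment}.pinecone.io`;
}

// Made-up sample values.
const host = buildIndexHost({
  indexName: 'my-index',
  projectName: 'abc1234',
  environment: 'us-west1-gcp',
});
console.log(host);
```

With an empty projectName the host cannot be formed correctly, which would fit the workaround of assigning `pinecone.projectName` on the client object after init.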

@mayooear
Owner

> Another one I forgot to mention: make sure to change the GPT model if you don't have access to GPT-4 yet. I've changed mine to gpt-3.5-turbo.

This is a major cause of issues. Many people attempt to use gpt-4 when they don't yet have access.

@mayooear
Owner

> So I finally managed to make it work.
>
> The Pinecone lib uses the projectName to build the URL, so you have to set it on the client object. The projectName can be found in the Pinecone web UI.
>
> I have no idea why it works for everyone else.

Strange that it doesn't work for you without setting a projectName.

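@yaya6630170
Author

> > Another one I forgot to mention: make sure to change the GPT model if you don't have access to GPT-4 yet. I've changed mine to gpt-3.5-turbo.
>
> This is a major cause of issues. Many people attempt to use gpt-4 when they don't yet have access.

It's already set to gpt-3.5-turbo T_T. The text is being split too, so I really don't understand what the problem is. Does anything on the Pinecone side need to be changed? I see pinecone-client mentioned above; do I need to install that?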

@felipeotarola

> So I finally managed to make it work.
>
> The Pinecone lib uses the projectName to build the URL, so you have to set it on the client object. The projectName can be found in the Pinecone web UI.
>
> I have no idea why it works for everyone else.

Awesome, thanks for this solution. I was experiencing this too. Adding pinecone.projectName to the initPinecone function got me to the next problem, which was caused by the basePath in the Pinecone library: the concatenation didn't work for my URL, so I just hardcoded it and it worked.

@okmike

okmike commented Mar 28, 2023

> error [Error: Network Error] { config: { ... url: 'https://api.openai.com/v1/embeddings' }, code: 'ERR_NETWORK', ... response: undefined, isAxiosError: true, ... }
>
> [Error: Failed to ingest your data]

Exact same problem here. I keep getting the Network Error message. I have the Clash proxy running at the same time, though, and I'm not sure whether Clash causes network conflicts.

@okmike

okmike commented Mar 28, 2023

I tried switching the Clash proxy mode between Global, Rule, and Direct multiple times, and the Network Error message keeps appearing.

@yudidina

> I tried switching the Clash proxy mode between Global, Rule, and Direct multiple times, and the Network Error message keeps appearing.

Me too, damn.

@oashua

oashua commented Mar 29, 2023

I ran into a similar "Failed to ingest your data" problem, but the underlying error is "error [Error: Network Error]", which appears after "creating vector store...".
image
I'm sure the proxy settings are right (both for bash and npm).

@okmike

okmike commented Mar 29, 2023

Problem solved. For those who use Clash and VS Code at the same time, try the following and check the results. I am not sure which steps are necessary, but it works for me anyway.

  1. Add pinecone.projectName to your file.
    image
  2. Turn on the Clash TUN mode.
    image
  3. Change Clash to Global mode, and make sure the proxy address has access to the OpenAI website.
  4. Final result on the Pinecone website:
    image
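
One possible reading of why TUN mode helped: as far as I know, the fetch built into Node 18 does not pick up the system proxy or proxy environment variables on its own, while TUN mode routes all traffic through Clash at the network level. The sketch below only reports which proxy variable, if any, is visible to the script. `pickProxyUrl` is a hypothetical helper and the port in the usage note is just an example.

```typescript
// Hypothetical helper: return the first proxy URL found in an environment-like map.
function pickProxyUrl(env: Record<string, string | undefined>): string | undefined {
  const candidates = ['HTTPS_PROXY', 'https_proxy', 'HTTP_PROXY', 'http_proxy'];
  for (const name of candidates) {
    const value = (env[name] ?? '').trim();
    if (value.length > 0) return value;
  }
  return undefined;
}

// Turn the lookup into a one-line message for the console.
function describeProxy(env: Record<string, string | undefined>): string {
  const url = pickProxyUrl(env);
  return url === undefined
    ? 'no proxy variable set; requests go out directly unless TUN mode intercepts them'
    : `proxy variable set: ${url}`;
}
```

For example, `describeProxy({ https_proxy: 'http://127.0.0.1:7890' })` reports the variable, but whether the HTTP client actually honours it is a separate question that this sketch does not answer.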

@okmike

okmike commented Mar 29, 2023

> It's already set to gpt-3.5-turbo T_T. The text is being split too, so I really don't understand what the problem is. Does anything on the Pinecone side need to be changed? I see pinecone-client mentioned above; do I need to install that?

If you're using Clash as your proxy, take a look at my reply above.

@oashua

oashua commented Mar 29, 2023

> Problem solved. For those who use Clash and VS Code at the same time, try the following and check the results. I am not sure which steps are necessary, but it works for me anyway.
>
>   1. Add pinecone.projectName to your file.
>   2. Turn on the Clash TUN mode.
>   3. Change Clash to Global mode, and make sure the proxy address has access to the OpenAI website.
>   4. Final result on the Pinecone website.

I've seen people online mention enabling TUN mode. After I enabled it, the node list disappeared, and I had to uninstall service mode to get it back. The ways of setting up TUN mode are all over the place...

@yaya6630170
Author

> If you're using Clash as your proxy, take a look at my reply above.

Awesome, it seems I can reach Pinecone now, but there's a new problem:

creating vector store...
error [TypeError: documents.map is not a function]
d:\Documents\GitHub\gpt4-pdf-chatbot-langchain\scripts\ingest-data.ts:44
throw new Error('Failed to ingest your data');
^

[Error: Failed to ingest your data]
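
"documents.map is not a function" means that whatever reached the vector-store call was not an array. A guard like the one below, placed just before that call, would at least say what was actually passed. `DocLike` and `assertDocumentArray` are hypothetical names; the `pageContent` and `metadata` fields are assumed to mirror the shape of the documents the script produces.

```typescript
// Assumed minimal shape of a document as produced by the splitter.
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical guard: fail early with a readable message if the value is not
// an array of document-like objects.
function assertDocumentArray(value: unknown): DocLike[] {
  if (!Array.isArray(value)) {
    const kind = value === null ? 'null' : typeof value;
    throw new TypeError(`expected an array of documents, got ${kind}`);
  }
  value.forEach((item, index) => {
    const ok =
      typeof item === 'object' &&
      item !== null &&
      typeof (item as DocLike).pageContent === 'string';
    if (!ok) throw new TypeError(`item ${index} is not a document`);
  });
  return value as DocLike[];
}
```

If the guard reports an object instead of an array, one thing worth checking is whether the arguments are in the order the installed library version expects, since updating packages was part of the advice earlier in the thread.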

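@zina0

zina0 commented Mar 29, 2023

I'm hitting this problem too. I'm also using a proxy, but it still doesn't work; apparently Pinecone can't ingest the data either. If anyone has a solution, please share it.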

@okmike

okmike commented Mar 29, 2023

I'm not sure about that. In my case I simply turned on the TUN mode switch, and then ticked everything that could be selected under UWP Loopback.

@okmike

okmike commented Mar 29, 2023

> Awesome, it seems I can reach Pinecone now, but there's a new problem:
>
> creating vector store...
> error [TypeError: documents.map is not a function]
> d:\Documents\GitHub\gpt4-pdf-chatbot-langchain\scripts\ingest-data.ts:44
> throw new Error('Failed to ingest your data');
>
> [Error: Failed to ingest your data]

Maybe try installing pico-client?

@GrantRomero

> Strange that it doesn't work for you without setting a projectName.

I added the project name and am still getting this error:
Capture

@hipnologo

I am facing a similar problem. After running npm run ingest and getting a success message, I checked the Pinecone index info and found a namespace "books" with 309 vectors in total. However, after starting the app with pnpm run dev and typing a question, I see the error below in the logs and an empty response on the client page.

PineconeClient: Error getting project name: TypeError: fetch failed
error - [Error: PineconeClient: Project name not set. Call init() first.] {
  page: '/api/chat'
}
wait  - compiling /_error (client and server)...

The Pinecone and config files match the README instructions.

@naticio

naticio commented Apr 2, 2023

same error for me...
image

@naticio

naticio commented Apr 2, 2023

> initPinecone

Thanks, I did this, but now I get a new error :(
[Error: PineconeClient: Error calling upsertRaw: FetchError: The request failed and the interceptors did not return an alternative response]

@stephanmingoes

image

Is anyone else getting status code 429 aka "too many requests"?

@Dasheverless

> Is anyone else getting status code 429 aka "too many requests"?

You can use the Python ChatGPT API demo to check your problem. I fixed this issue by setting up a payment method on my OpenAI API page.

@cklingspor

With regard to the 429, check out here:

> The reason was that I created my API key BEFORE converting my OpenAI account to paid (adding a credit card). It doesn't matter if you only upgrade; you also need to create a new API key entirely. I created another API key AFTER I added my credit card and it worked fine!

This helped me as well.

@larri-eng

I am getting the same error. Any insights?

request: Request {
[Symbol(realm)]: { settingsObject: [Object] },
[Symbol(state)]: {
method: 'POST',
localURLsOnly: false,
unsafeRequest: false,
body: [Object],
client: [Object],
reservedClient: null,
replacesClientId: '',
window: 'client',
keepalive: false,
serviceWorkers: 'all',
initiator: '',
destination: '',
priority: null,
origin: 'client',
policyContainer: 'client',
referrer: 'client',
referrerPolicy: '',
mode: 'cors',
useCORSPreflightFlag: false,
credentials: 'same-origin',
useCredentials: false,
cache: 'default',
redirect: 'follow',
integrity: '',
cryptoGraphicsNonceMetadata: '',
parserMetadata: '',
reloadNavigation: false,
historyNavigation: false,
userActivation: false,
taintedOrigin: false,
redirectCount: 0,
responseTainting: 'basic',
preventNoCacheCacheControlHeaderModification: false,
done: false,
timingAllowFailed: false,
headersList: [HeadersList],
urlList: [Array],
url: [URL]
},
[Symbol(signal)]: AbortSignal { aborted: false },
[Symbol(headers)]: HeadersList {
cookies: null,
[Symbol(headers map)]: [Map],
[Symbol(headers map sorted)]: null
}
},
response: {
ok: false,
status: 401,
statusText: 'Unauthorized',
headers: HeadersList {
cookies: null,
[Symbol(headers map)]: [Map],
[Symbol(headers map sorted)]: null
},
config: {
transitional: [Object],
adapter: [AsyncFunction: fetchAdapter],
transformRequest: [Array],
transformResponse: [Array],
timeout: 0,
xsrfCookieName: 'XSRF-TOKEN',
xsrfHeaderName: 'X-XSRF-TOKEN',
maxContentLength: -1,
maxBodyLength: -1,
validateStatus: [Function: validateStatus],
headers: [Object],
method: 'post',
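
The logs in this thread show three different failure shapes: no response at all with code 'ERR_NETWORK', a 401 response, and a 429 response. A rough sketch of telling them apart before rethrowing could look like this. `RequestErrorLike` and `diagnose` are hypothetical names, and the error shape is only assumed to match the fields visible in the logs above.

```typescript
// Assumed subset of the error fields that appear in the logs in this thread.
interface RequestErrorLike {
  code?: string;
  response?: { status: number };
}

type Diagnosis = 'network' | 'unauthorized' | 'too-many-requests' | 'other';

// Hypothetical helper: map an error to the kind of problem it most likely indicates.
function diagnose(err: RequestErrorLike): Diagnosis {
  if (err.response === undefined) {
    // No response came back: proxy or connectivity is the first thing to check.
    return err.code === 'ERR_NETWORK' ? 'network' : 'other';
  }
  switch (err.response.status) {
    case 401:
      return 'unauthorized'; // check that the API key is correct and loaded
    case 429:
      return 'too-many-requests'; // check rate limits and billing on the account
    default:
      return 'other';
  }
}
```

Logging the diagnosis next to the generic "Failed to ingest your data" message would make it clearer which of the fixes discussed above applies.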

@dosubot

dosubot bot commented Sep 24, 2023

Hi, @yaya6630170! I'm Dosu, and I'm helping the gpt4-pdf-chatbot-langchain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you opened the issue titled "Error: Failed to ingest your data" because you were experiencing an error while trying to extract code from a PDF. Other users have provided suggestions and solutions, such as updating packages, installing VS Studio, and setting the projectName in the pinecone library. There was also a discussion about receiving a status code 429 (too many requests), with suggestions to check the OpenAI API key and payment method.

Before we close this issue, we wanted to check if it is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding, and please don't hesitate to reach out if you have any further questions or concerns.

@dosubot added the stale label (Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed) Sep 24, 2023
@dosubot closed this as not planned (won't fix, can't repro, duplicate, stale) Oct 1, 2023
@dosubot removed the stale label Oct 1, 2023