Failed to calculate number of tokens with tiktoken, falling back to approximate count #33

Closed
braco opened this issue Apr 26, 2023 · 2 comments

braco commented Apr 26, 2023

I'm running langchainjs with its default summarizer in a loop over different documents. At some point tiktoken starts producing the errors below, and restarting the process makes them go away.

Failed to calculate number of tokens with tiktoken, falling back to approximate count RuntimeError: unreachable
    at wasm://wasm/00b5f812:wasm-function[563]:0x6a72a
    at wasm://wasm/00b5f812:wasm-function[665]:0x6fd7a
    at wasm://wasm/00b5f812:wasm-function[756]:0x70f7f
    at wasm://wasm/00b5f812:wasm-function[237]:0x5c43a
    at wasm://wasm/00b5f812:wasm-function[200]:0x4db89
    at wasm://wasm/00b5f812:wasm-function[34]:0x1f78a
    at wasm://wasm/00b5f812:wasm-function[159]:0x48dc3
    at Tiktoken.encode (/project/node_modules/@dqbd/tiktoken/tiktoken_bg.cjs:262:18)
    at OpenAIChat.getNumTokens (file:///project/node_modules/langchain/dist/base_language/index.js:80:44)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

Failed to calculate number of tokens with tiktoken, falling back to approximate count Error: S@/b
    at module.exports.__wbindgen_error_new (/project/node_modules/@dqbd/tiktoken/tiktoken_bg.cjs:410:17)
    at wasm://wasm/00b5f812:wasm-function[59]:0x29389
    at module.exports.encoding_for_model (/project/node_modules/@dqbd/tiktoken/tiktoken_bg.cjs:177:14)
    at OpenAIChat.getNumTokens (file:///project/node_modules/langchain/dist/base_language/index.js:70:38)
    at async Promise.all (index 1104)
    at async MapReduceDocumentsChain._call (file:///project/node_modules/langchain/dist/chains/combine_docs_chain.js:154:28)
    at async MapReduceDocumentsChain.call (file:///project/node_modules/langchain/dist/chains/base.js:50:28)
    at async summarizer (file:///project/lib/gpt.mjs:179:20)
❯ node --version                        
v18.12.1
❯ yarn why @dqbd/tiktoken
=> Found "@dqbd/tiktoken@1.0.6"

Also filed in langchainjs, not sure where the issue is:
langchain-ai/langchainjs#1009

dqbd (Owner) commented Apr 28, 2023

Hi @braco!

Thank you for the report. It does seem like langchain keeps reinstantiating the encoder without freeing it. I will look into it further when I have some time.
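
One way to avoid the repeated instantiation described above is to build each encoder once and reuse it across documents. This is only a minimal sketch, not what langchain does: `makeEncoderCache` and `createEncoder` are hypothetical names, and the only assumption about the encoder is that it exposes `free()`, as the WASM package's encoders do.

```javascript
// Hypothetical helper: creates one encoder per model and reuses it afterwards.
// `createEncoder` is an assumed factory, e.g. (model) => encoding_for_model(model)
// with encoding_for_model imported from @dqbd/tiktoken.
function makeEncoderCache(createEncoder) {
  const cache = new Map();
  return {
    // Return the cached encoder for `model`, creating it on first use only.
    get(model) {
      if (!cache.has(model)) {
        cache.set(model, createEncoder(model));
      }
      return cache.get(model);
    },
    // Release every cached encoder (for example at shutdown) and empty the cache.
    freeAll() {
      for (const encoder of cache.values()) {
        encoder.free();
      }
      cache.clear();
    },
  };
}
```

In a loop over many documents, calling `get(model)` each time would then hand back the same instance instead of allocating a new one per call.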

dqbd (Owner) commented May 15, 2023

On the langchain side, this will be fixed in langchain-ai/langchainjs#1239 by replacing the WASM package with the JS port.

The note about .free() should still be valid for the WASM package. Closing for now!
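
For anyone still on the WASM package, the .free() advice amounts to releasing the encoder once you are done with it, even when encoding throws. A minimal sketch under that assumption: `countTokens` and `createEncoder` are hypothetical names, and `createEncoder` stands in for something like `() => encoding_for_model("gpt-3.5-turbo")`.

```javascript
// Hypothetical helper: count tokens with a short-lived encoder and always free it.
// `createEncoder` is an assumed zero-argument factory returning an object with
// encode(text) and free(), which is the shape of the WASM package's encoder.
function countTokens(createEncoder, text) {
  const encoder = createEncoder();
  try {
    // encode() returns an array-like of token ids; its length is the token count.
    return encoder.encode(text).length;
  } finally {
    // Runs on success and on error, so the WASM-side memory is released either way.
    encoder.free();
  }
}
```

The `finally` block is the point of the sketch: if `encode` fails partway through a batch, the encoder is still released rather than left allocated.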

dqbd closed this as completed May 15, 2023