encodeChatGenerator undefined? #15

Closed
congqi09 opened this issue Jun 3, 2023 · 16 comments

@congqi09

congqi09 commented Jun 3, 2023

encodeChat error:

TypeError: Cannot read properties of undefined (reading 'encodeChatGenerator')
    at encodeChat
    node_modules/.pnpm/gpt-tokenizer@2.1.1/node_modules/gpt-tokenizer/cjs/GptEncoding.js:141:25

@OnCloud125252

I'm also facing this problem.
Maybe the author forgot to bundle some external modules?

The error occurs when I execute the following code:

import { encodeChat } from "gpt-tokenizer";

const messages = [ /* Valid chat */ ];

console.log(encodeChat(messages, "gpt-3.5-turbo"));

Here's the full error:

Note that I'm using WSL, so the file paths might look a bit unusual.

file:///mnt/d/Documents/.project/Script/GPT-Tokenizer/node_modules/.pnpm/gpt-tokenizer@2.1.1/node_modules/gpt-tokenizer/esm/GptEncoding.js:138
        return [...this.encodeChatGenerator(chat, model)].flat();
                        ^

TypeError: Cannot read property 'encodeChatGenerator' of undefined
    at encodeChat (file:///mnt/d/Documents/.project/Script/GPT-Tokenizer/node_modules/.pnpm/gpt-tokenizer@2.1.1/node_modules/gpt-tokenizer/esm/GptEncoding.js:138:25)
    at file:///mnt/d/Documents/.project/Script/GPT-Tokenizer/index.js:83:13
    at ModuleJob.run (internal/modules/esm/module_job.js:183:25)
    at async Loader.import (internal/modules/esm/loader.js:178:24)
    at async Object.loadESM (internal/process/esm_loader.js:68:5)
    at async handleMainPromise (internal/modules/run_main.js:59:12)
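
For anyone else reproducing this: the exact messages aren't shown above, but a hypothetical stand-in in the { role, content } shape that encodeChat accepts triggers the same error with gpt-tokenizer 2.1.1 and the named import:

// Hypothetical placeholder messages, not the ones from the original report.
import { encodeChat } from "gpt-tokenizer";

const exampleMessages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Hello!" },
];

// With v2.1.1 and a named import, this line throws the TypeError shown above.
console.log(encodeChat(exampleMessages, "gpt-3.5-turbo"));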

@bennyp11

bennyp11 commented Jun 6, 2023

I'm also facing the same issue.

@jack-corentin

I have the same issue.

@virajrai

virajrai commented Jun 6, 2023

Same issue.

@wgot

wgot commented Jun 8, 2023

@zz98 This problem is related to the way you're importing and using the encodeChat method from the gpt-tokenizer module.
When you use the following import statement:

import { encodeChat } from 'gpt-tokenizer'

You're importing encodeChat as a detached function from the original object (the module). Therefore, you won't have access to the context or properties of the original object.

However, when you use the import statement as follows:

import tokenizer from 'gpt-tokenizer'
tokenizer.encodeChat([{ ... }])

You're importing the entire original object. This allows you to access the properties and methods of the original object.
The error message you're seeing:

node_modules/gpt-tokenizer/src/GptEncoding.ts:246
    return [...this.encodeChatGenerator(chat, model)].flat()

suggests that this is undefined or not what's expected in this.encodeChatGenerator(chat, model). In JavaScript, when a method is called detached from its object, this becomes undefined (in strict mode). Hence, calling the detached encodeChat method results in an error.

However, when you call encodeChat in the form of tokenizer.encodeChat, the encodeChat method is bound to the tokenizer object. Therefore, this refers to the tokenizer object, and no error occurs.

To resolve this issue, you should import the whole tokenizer and call encodeChat as a method of tokenizer. This way, this will correctly refer to tokenizer, and the error should not occur. (This answer was written and translated by GPT-4.)
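
As a generic illustration of the mechanism described above (a minimal sketch, independent of gpt-tokenizer's actual internals): a method that relies on the this keyword breaks when it is detached from its object, and binding it restores the behavior.

// Toy object standing in for the tokenizer; not gpt-tokenizer's real code.
const toy = {
  prefix: 'tok',
  encode(text) {
    return `${this.prefix}:${text}` // relies on this
  },
}

const { encode } = toy      // detached, like a named import
toy.encode('hi')            // 'tok:hi' -- this is toy
// encode('hi')             // throws in an ES module: this is undefined
const bound = toy.encode.bind(toy)
bound('hi')                 // 'tok:hi' -- binding fixes this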

@amiran-gorgazjan

@wgot Thanks for the explanation! I just had the same issue.

However, it seems the underlying problem is that class methods really shouldn't be exported like that. Even the author themselves misunderstood the behavior of their own code, because the usage example in the main README.md is incorrect, since it uses the unbound functions:

import {
  encode,
  encodeChat,
  decode,
  isWithinTokenLimit,
  encodeGenerator,
  decodeGenerator,
  decodeAsyncGenerator,
} from 'gpt-tokenizer'
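
As a sketch of what bound exports could look like (illustrative only; the class and names below are hypothetical, not gpt-tokenizer's actual source), a library can pre-bind the instance methods before exporting them, so that named imports keep a valid this:

// Hypothetical library module that pre-binds its methods before exporting them.
class Encoding {
  constructor(name) {
    this.name = name
  }
  encodeChat(chat) {
    // Uses this, so it must stay bound to the instance.
    return chat.map((message) => `${this.name}:${message.content}`)
  }
}

const instance = new Encoding('cl100k_base')

// Consumers can now safely do: import { encodeChat } from '...'
export const encodeChat = instance.encodeChat.bind(instance)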

@dbjpanda

Same issue. Documentation needs to be updated as well.

@lox

lox commented Jun 19, 2023

Still doesn't work:

const tokenizer = require('gpt-tokenizer')
const chatTokens = tokenizer.encodeChat(chat, 'gpt-3.5-turbo')

I get this error:

TypeError: Cannot read properties of undefined (reading 'get')
    at Object.encodeChatGenerator (xxx/node_modules/gpt-tokenizer/cjs/GptEncoding.js:99:57)
    at encodeChatGenerator.next (<anonymous>)
    at Object.encodeChat (xxx/node_modules/gpt-tokenizer/cjs/GptEncoding.js:141:25)
    at cleanup (xxx/scripts/cleanupTranscript.js:40:32)
    at Object.<anonymous> (xxx/scripts/cleanupTranscript.js:62:1)
    at Module._compile (node:internal/modules/cjs/loader:1196:14)

@guzelbspnr

Same issue. This code works in esm but doesn't work in cjs:

const tokenizer = require('gpt-tokenizer')
const chatTokens = tokenizer.encodeChat(chat, 'gpt-3.5-turbo')

this.specialTokenMapping is undefined when the encodeChatGenerator function is called in cjs.

GptEncoding this => { default: [Getter], decode: [Getter], decodeAsyncGenerator: [Getter], decodeGenerator: [Getter], encode: [Getter], encodeChat: [Getter], encodeChatGenerator: [Getter], encodeGenerator: [Getter], isWithinTokenLimit: [Getter], EndOfText: [Getter], FimPrefix: [Getter], FimMiddle: [Getter], FimSuffix: [Getter], ImStart: [Getter], ImEnd: [Getter], ImSep: [Getter], EndOfPrompt: [Getter] } this.specialTokenMapping=> undefined

TypeError: Cannot read properties of undefined (reading 'get') at Object.encodeChatGenerator (xxx/node_modules/gpt-tokenizer/cjs/GptEncoding.js:110:57) at encodeChatGenerator.next (<anonymous>) at Object.encodeChat (xxx/node_modules/gpt-tokenizer/cjs/GptEncoding.js:153:25)

@guzelbaspinar

Same issue. This code works in esm but doesn't work in cjs:

const tokenizer = require('gpt-tokenizer')
const chatTokens = tokenizer.encodeChat(chat, 'gpt-3.5-turbo')

this.specialTokenMapping is undefined when the encodeChatGenerator function is called in cjs.

GptEncoding this => { default: [Getter], decode: [Getter], decodeAsyncGenerator: [Getter], decodeGenerator: [Getter], encode: [Getter], encodeChat: [Getter], encodeChatGenerator: [Getter], encodeGenerator: [Getter], isWithinTokenLimit: [Getter], EndOfText: [Getter], FimPrefix: [Getter], FimMiddle: [Getter], FimSuffix: [Getter], ImStart: [Getter], ImEnd: [Getter], ImSep: [Getter], EndOfPrompt: [Getter] } this.specialTokenMapping=> undefined

TypeError: Cannot read properties of undefined (reading 'get') at Object.encodeChatGenerator (xxx/node_modules/gpt-tokenizer/cjs/GptEncoding.js:110:57) at encodeChatGenerator.next (<anonymous>) at Object.encodeChat (xxx/node_modules/gpt-tokenizer/cjs/GptEncoding.js:153:25)

Solved it like this for cjs:

const tokenizer = require('gpt-tokenizer').default;
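
Putting the workaround together, a minimal cjs sketch (the chat array here is a hypothetical placeholder):

// Workaround reported above: the callable instance lives on the default export,
// and calling encodeChat through it keeps the correct this binding.
const tokenizer = require('gpt-tokenizer').default

const chat = [{ role: 'user', content: 'Hello!' }] // hypothetical messages
const chatTokens = tokenizer.encodeChat(chat, 'gpt-3.5-turbo')
console.log(chatTokens.length)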

@injectionsuccesfully

injectionsuccesfully commented Jul 27, 2023

Same issue:

return [...this.encodeChatGenerator(chat, model)].flat();
TypeError: Cannot read properties of undefined (reading 'encodeChatGenerator')

@sameekhan

Same issue. This code works in esm but doesn't work in cjs:

const tokenizer = require('gpt-tokenizer')
const chatTokens = tokenizer.encodeChat(chat, 'gpt-3.5-turbo')

this.specialTokenMapping is undefined when the encodeChatGenerator function is called in cjs.

GptEncoding this => { default: [Getter], decode: [Getter], decodeAsyncGenerator: [Getter], decodeGenerator: [Getter], encode: [Getter], encodeChat: [Getter], encodeChatGenerator: [Getter], encodeGenerator: [Getter], isWithinTokenLimit: [Getter], EndOfText: [Getter], FimPrefix: [Getter], FimMiddle: [Getter], FimSuffix: [Getter], ImStart: [Getter], ImEnd: [Getter], ImSep: [Getter], EndOfPrompt: [Getter] } this.specialTokenMapping=> undefined

TypeError: Cannot read properties of undefined (reading 'get') at Object.encodeChatGenerator (xxx/node_modules/gpt-tokenizer/cjs/GptEncoding.js:110:57) at encodeChatGenerator.next (<anonymous>) at Object.encodeChat (xxx/node_modules/gpt-tokenizer/cjs/GptEncoding.js:153:25)

Solved it like this for cjs:

const tokenizer = require('gpt-tokenizer').default;

This fixed my issue! Thanks!

@seyfer

seyfer commented Sep 26, 2023

@wgot Thanks for the explanation! I just had the same issue.

However, it seems the underlying problem is that class methods really shouldn't be exported like that. Even the author themselves misunderstood the behavior of their own code, because the usage example in the main README.md is incorrect, since it uses the unbound functions:

import {
  encode,
  encodeChat,
  decode,
  isWithinTokenLimit,
  encodeGenerator,
  decodeGenerator,
  decodeAsyncGenerator,
} from 'gpt-tokenizer'

@niieani this is a valid point, please adjust the documentation for encodeChat usage.

@niieani
Owner

niieani commented Sep 28, 2023

Apologies folks, the documentation was co-written by ChatGPT and I missed this during my manual review. 😄
I'll make a fix soon.

niieani closed this as completed in 86c270c on Oct 7, 2023
@niieani
Owner

niieani commented Oct 7, 2023

It wasn't a documentation issue after all; I just forgot to bind the functions. Should be fixed in the next version.
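
For reference, once the exports are bound (the 2.1.2 release announced below), the named import from the README is expected to work directly; a minimal sketch with hypothetical messages:

// Expected usage with gpt-tokenizer >= 2.1.2; the messages are placeholders.
import { encodeChat } from 'gpt-tokenizer'

const chat = [{ role: 'user', content: 'Hello!' }]
console.log(encodeChat(chat, 'gpt-3.5-turbo'))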

@github-actions

github-actions bot commented Oct 7, 2023

🎉 This issue has been resolved in version 2.1.2 🎉

The release is available on:

Your semantic-release bot 📦🚀
