
pyinstaller has some bug that results in improper packaging of tiktoken #43

Closed
bofinbabu opened this issue Mar 1, 2023 · 22 comments

@bofinbabu

bofinbabu commented Mar 1, 2023

What could be the fix for this error? I am trying out the library for the first time.

import tiktoken
enc = tiktoken.get_encoding("gpt2")
assert enc.decode(enc.encode("hello world")) == "hello world"
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [47], in <cell line: 2>()
      1 import tiktoken
----> 2 enc = tiktoken.get_encoding("gpt2")
      3 assert enc.decode(enc.encode("hello world")) == "hello world"

File ~/work/p3ds/lib/python3.10/site-packages/tiktoken/registry.py:60, in get_encoding(encoding_name)
     57     assert ENCODING_CONSTRUCTORS is not None
     59 if encoding_name not in ENCODING_CONSTRUCTORS:
---> 60     raise ValueError(f"Unknown encoding {encoding_name}")
     62 constructor = ENCODING_CONSTRUCTORS[encoding_name]
     63 enc = Encoding(**constructor())

ValueError: Unknown encoding gpt2


@hauntsaninja
Collaborator

How did you install tiktoken?

@shirubei

shirubei commented Mar 4, 2023

Maybe a similar case here.
I compiled wechatGPT_Turbo.py into an executable using pyinstaller under Windows 10.
When I ran the executable directly, it showed the error message listed below.

C:\Users\Administrator\Downloads>wechatGPT_Turbo.exe
Traceback (most recent call last):
  File "wechatGPT_Turbo.py", line 13, in <module>
    from revChatGPT_Turbo import Chatbot as Turbot
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 352, in exec_module
  File "revChatGPT_Turbo.py", line 17, in <module>
    ENCODER = tiktoken.get_encoding("gpt2")
  File "tiktoken\registry.py", line 60, in get_encoding
ValueError: Unknown encoding gpt2

btw, tiktoken was installed via "pip3 install tiktoken" and imported into revChatGPT_Turbo.py as below:
import tiktoken

But when I ran "python wechatGPT_Turbo.py", everything was OK.
Any suggestion is appreciated. Thank you!

@Jeremy-ttt

Same issue as above. When I made it into an executable, the error came up.

@hauntsaninja
Collaborator

I haven't ever used pyinstaller; sounds like there's a bug in it? The tiktoken distribution on PyPI contains two packages, tiktoken and tiktoken_ext, and it needs both of them for tiktoken.get_encoding("gpt2") to work.

Maybe see if the pyinstaller people know what the issue is. I'm willing to make minor adjustments to how tiktoken specifies packaging metadata to support the use case.
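As a quick sanity check that both packages actually made it into an environment or a frozen bundle, something like the snippet below can be run (an illustrative check only, not part of tiktoken itself):

# Illustrative check: all three modules must be importable for get_encoding("gpt2") to work.
import importlib

for mod in ("tiktoken", "tiktoken_ext", "tiktoken_ext.openai_public"):
    try:
        importlib.import_module(mod)
        print(f"{mod}: OK")
    except ImportError as exc:
        print(f"{mod}: MISSING ({exc})")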

@shirubei

shirubei commented Mar 6, 2023

I haven't ever used pyinstaller; sounds like there's a bug in it? The tiktoken distribution on PyPI contains two packages, tiktoken and tiktoken_ext, and it needs both of them for tiktoken.get_encoding("gpt2") to work.

Maybe see if the pyinstaller people know what the issue is. I'm willing to make minor adjustments to how tiktoken specifies packaging metadata to support the use case.

Thank you for the response. Seems there's a bug in pyinstaller. I'll open an issue there.

@shirubei

shirubei commented Mar 6, 2023

I'm willing to make minor adjustments to how tiktoken specifies packaging metadata to support the use case.

I have no idea about package metadata, but pyinstaller does have a --copy-metadata PACKAGENAME option. If minor changes are made, I'm glad to give it a try and report back. Thank you.

@Jeremy-ttt

Jeremy-ttt commented Mar 6, 2023

I have solved it using the methods below:

1. Add
--hidden-import=tiktoken_ext.openai_public --hidden-import=tiktoken_ext
when you use pyinstaller to make it executable.

2. Delete the code
with open(os.path.join(_SCRIPT_DIR, "VERSION")) as _version_file: __version__ = _version_file.read().strip()
in the "blobfile" module's __init__.py.

Hope it works for you.
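For reference, a complete pyinstaller invocation with those flags might look roughly like this (a sketch only; wechatGPT_Turbo.py is the entry script from the traceback above, substitute your own):

pyinstaller --onefile --hidden-import=tiktoken_ext --hidden-import=tiktoken_ext.openai_public wechatGPT_Turbo.py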

@shirubei

shirubei commented Mar 6, 2023

I have solved it using the methods below:

1. Add --hidden-import=tiktoken_ext.openai_public --hidden-import=tiktoken_ext when you use pyinstaller to make it executable.

2. Delete the code with open(os.path.join(_SCRIPT_DIR, "VERSION")) as _version_file: __version__ = _version_file.read().strip() in the "blobfile" module's __init__.py.

Many thanks!
After these steps, I ran into another error: 'Could not find module 'C:\Users\{MYNAME}\AppData\Local\Temp\_MEI160522\tls_client\dependencies\tls-client-64.dll''.

I created a dir named dll and put tls-client-64.dll in it, then added the option below, which finally solved the problem.
--add-binary "dll\tls-client-64.dll;tls_client/dependencies"
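Putting both workarounds into a single command, the full invocation might look roughly like this (again just a sketch, assuming tls-client-64.dll has been copied into a local dll\ folder):

pyinstaller --onefile --hidden-import=tiktoken_ext --hidden-import=tiktoken_ext.openai_public --add-binary "dll\tls-client-64.dll;tls_client/dependencies" wechatGPT_Turbo.py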

hauntsaninja changed the title from "Not able to use gpt2 encoder: Error" to "pyinstaller has some bug that results in improper packaging of tiktoken" on Mar 6, 2023
@hauntsaninja
Collaborator

It looks like some of the issue here is the blobfile dependency. Most people won't need that; I can make that an optional dependency.

@hauntsaninja
Collaborator

I've made blobfile an optional dependency in 0.3.1.

Based on Jeremy-ttt's message, it sounds like the rest of this can be handled by pyinstaller's --hidden-import.

Let me know if there's anything else I can do here; if not, I'll close this issue soon.
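(If your own code does still rely on blobfile, for example for loading encoding files from custom paths, it can presumably be installed explicitly alongside tiktoken:)

pip3 install blobfile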

@ManlyMoustache

I have solved it using the methods below:

1. Add --hidden-import=tiktoken_ext.openai_public --hidden-import=tiktoken_ext when you use pyinstaller to make it executable.

2. Delete the code with open(os.path.join(_SCRIPT_DIR, "VERSION")) as _version_file: __version__ = _version_file.read().strip() in the "blobfile" module's __init__.py.

Hope it works for you.

This answer prevented me from going totally loco. Worked like a charm. I was trying the --hidden-import method for tiktoken_ext but not for tiktoken_ext.openai_public, and this seems to have fixed the issue completely!

Thanks a lot!

@MysticDragonfly

I have solved it using the methods below:

1. Add --hidden-import=tiktoken_ext.openai_public --hidden-import=tiktoken_ext when you use pyinstaller to make it executable.

2. Delete the code with open(os.path.join(_SCRIPT_DIR, "VERSION")) as _version_file: __version__ = _version_file.read().strip() in the "blobfile" module's __init__.py.

Hope it works for you.

Thank you very much! This solved my problem!

@MikkoHaavisto

To clarify, just adding the hidden imports mentioned in step 1 fixed the bug for me and let me build the .py into an executable.

@hauntsaninja
Collaborator

If a comment fixed the issue for you, please show your appreciation via emoji reactions instead of commenting :-)

@juanmillans

I ran into the same issue using Auto-py-to-exe. Did you find a way to solve it?

@juanmillans

juanmillans commented Apr 5, 2023

I just solved it! You have to use Auto-py-to-exe and manually add every single library, tiktoken and tiktoken_ext, as one of the guys up in the comments said. I'm so happy it worked! By the way, you have to use Auto-py-to-exe for it to work, or type the options yourself in pyinstaller, which will certainly be more painful.

@hauntsaninja
Collaborator

Closing, since there's nothing for tiktoken to do here. I added a mention of this in an FAQ issue: #98

@octimot

octimot commented Apr 12, 2023

This still seems to be an issue on my end, when trying to include this package via a .spec file instead of the --hidden-import CLI argument.

@shirubei Did you open this issue on pyinstaller/issues? I can't seem to find it.

@hauntsaninja
Technically, importing tiktoken_ext should also include tiktoken_ext.openai_public, so there's still something missing here... Maybe pyinstaller needs the __init__.py file for tiktoken_ext so that it knows it's a package, as per the Python manual?

Cheers!

@hauntsaninja
Collaborator

hauntsaninja commented Apr 12, 2023

It doesn't need the __init__.py; tiktoken_ext is a namespace package. We use this to allow extensibility, e.g. see https://github.com/openai/tiktoken#extending-tiktoken
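For context, discovering plugins in a namespace package generally follows the pattern sketched below. This is an illustration of the mechanism rather than a verbatim copy of tiktoken's registry code, and the ENCODING_CONSTRUCTORS attribute on the plugin module is assumed here:

# Sketch of namespace-package plugin discovery (illustrative, not tiktoken's exact code).
import importlib
import pkgutil

import tiktoken_ext  # namespace package: no __init__.py, its __path__ spans every installed portion

ENCODING_CONSTRUCTORS = {}
for mod_info in pkgutil.iter_modules(tiktoken_ext.__path__, tiktoken_ext.__name__ + "."):
    plugin = importlib.import_module(mod_info.name)  # e.g. tiktoken_ext.openai_public
    ENCODING_CONSTRUCTORS.update(plugin.ENCODING_CONSTRUCTORS)  # assumed plugin attribute

# PyInstaller's static analysis cannot see modules imported this way at runtime,
# which is why tiktoken_ext and tiktoken_ext.openai_public must be declared as hidden imports.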

@octimot

octimot commented Apr 12, 2023

Got it, thanks!

I see there's a way to hook namespace packages according to Pyinstaller.

I'll dig more and try to figure out what's going on. Maybe it's something weird on my machine...

@octimot

octimot commented Apr 12, 2023

Never mind! I found the issue in my .spec file.

To anyone else who is as clumsy as I am when not using the command line arguments with pyinstaller, the proper way to include hidden imports via a .spec file is to append them directly to the hiddenimports list.

In other words, just add this (preferably after hiddenimports = []):

# add tiktoken_ext to hidden imports
hiddenimports.append('tiktoken_ext')
hiddenimports.append('tiktoken_ext.openai_public')
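For anyone unfamiliar with .spec files, that hiddenimports list is the one handed to the Analysis object further down in the file. A minimal excerpt might look like the sketch below, where your_script.py is just a placeholder for the real entry point:

# Minimal .spec excerpt (a sketch; Analysis is provided by PyInstaller when the spec is executed)
hiddenimports = []
hiddenimports.append('tiktoken_ext')
hiddenimports.append('tiktoken_ext.openai_public')

a = Analysis(
    ['your_script.py'],  # placeholder entry point
    hiddenimports=hiddenimports,
)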

@Lucienxhh

A simpler solution.
Just add these lines to the code that imports tiktoken, so that pyinstaller's static import analysis sees the modules and bundles them.

from tiktoken_ext import openai_public
import tiktoken_ext

gjreda added a commit to refstudio/refstudio that referenced this issue Sep 12, 2023
gjreda added a commit to refstudio/refstudio that referenced this issue Sep 13, 2023
* Adopt litellm for rewrite - phase 1

* Add necessary hidden-imports (see: openai/tiktoken#43)
This issue was closed.