[BUG-REPORT] Exception in pyinstaller bundled app for vaex >=4.6.0 #1823

Open
schwingkopf opened this issue Jan 12, 2022 · 8 comments

@schwingkopf

Description

I'm hitting two exceptions with the latest vaex versions (4.6.0 and 4.7.0) after bundling with pyinstaller 4.7.

First exception

Traceback (most recent call last):
  File "main.py", line 1, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "PyInstaller\loader\pyimod03_importers.py", line 476, in exec_module
  File "vaex\__init__.py", line 43, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "PyInstaller\loader\pyimod03_importers.py", line 476, in exec_module
  File "vaex\dataset.py", line 13, in <module>
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "PyInstaller\loader\pyimod03_importers.py", line 476, in exec_module
  File "frozendict\__init__.py", line 22, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\git_repos\\pyinstaller_problem2_minmal\\dist\\main\\frozendict\\VERSION'

It's caused by the VERSION file of frozendict (new in versions >2.0) not being bundled. That's actually a pyinstaller/frozendict issue rather than a vaex one, but I wanted to post the solution here since others will likely run into it. It can be solved with the following hook file:

hook-frozendict.py:

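from pathlib import Path
import frozendict

# Bundle frozendict's VERSION file into the 'frozendict' folder of the app.
# The source path is converted to a plain string to be safe across PyInstaller versions.
datas = [(str(Path(frozendict.__path__[0]) / 'VERSION'), 'frozendict')]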

Second exception

Hello world
Traceback (most recent call last):
  File "main.py", line 6, in <module>
  File "vaex\dataframe.py", line 928, in count
  File "vaex\dataframe.py", line 902, in _compute_agg
  File "vaex\dataframe.py", line 1672, in _delay
  File "vaex\dataframe.py", line 412, in execute
  File "vaex\execution.py", line 181, in execute
  File "vaex\execution.py", line 186, in run
  File "vaex\asyncio.py", line 51, in just_run
  File "nest_asyncio.py", line 81, in run_until_complete
  File "asyncio\futures.py", line 181, in result
  File "asyncio\tasks.py", line 249, in __step
  File "vaex\execution.py", line 334, in execute_async
  File "vaex\memory.py", line 37, in create_tracker
ValueError: No memory tracker found with name default
[1272] Failed to execute script 'main' due to unhandled exception!

For this one I have not found a solution yet and would appreciate some help. I have trouble understanding how the embedded importing in vaex/memory.py works (and I guess so does pyinstaller). Any hints on how to solve this?
This is the relevant code section from vaex/memory.py:

def create_tracker():
    memory_tracker_type = vaex.settings.main.memory_tracker.type
    if not _memory_tracker_types:
        with lock:
            if not _memory_tracker_types:
                for entry in pkg_resources.iter_entry_points(group="vaex.memory.tracker"):
                    _memory_tracker_types[entry.name] = entry.load()
    cls = _memory_tracker_types.get(memory_tracker_type)
    if cls is not None:
        return cls()
    raise ValueError(f"No memory tracker found with name {memory_tracker_type}")
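
To check whether a given environment (or the bundled app) can see these entry points at all, something like the following could be run from main.py. It is only a sketch: it uses the stdlib importlib.metadata (Python 3.8+, so not the 3.7.9 setup reported here) instead of the pkg_resources call that vaex uses, and list_entry_points is a hypothetical helper name.

```python
from importlib import metadata


def list_entry_points(group):
    """Return {name: 'module:attr'} for every entry point registered in `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry_points() returns a selectable collection.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict of group -> entry points.
        selected = eps.get(group, ())
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    # Group name taken from the vaex/memory.py snippet above.
    print(list_entry_points("vaex.memory.tracker"))
```

An empty dict in the frozen build, next to a non-empty one under plain python, would suggest that the package metadata carrying the entry points did not make it into the bundle.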

Steps to reproduce are the following:

main.py:

import vaex

print("Hello world")

df = vaex.from_dict({'A':[1,2,3]})
print(df.count())

When run directly with python main.py, the script works fine.

Bundle using pyinstaller 4.7 (with the hook-frozendict.py mentioned above in the current directory):
pyinstaller --onedir --additional-hooks-dir=. main.py

Output of main.exe is:

Hello world
Traceback (most recent call last):
  File "main.py", line 6, in <module>
  File "vaex\dataframe.py", line 928, in count
  File "vaex\dataframe.py", line 902, in _compute_agg
  File "vaex\dataframe.py", line 1672, in _delay
  File "vaex\dataframe.py", line 412, in execute
  File "vaex\execution.py", line 181, in execute
  File "vaex\execution.py", line 186, in run
  File "vaex\asyncio.py", line 51, in just_run
  File "nest_asyncio.py", line 81, in run_until_complete
  File "asyncio\futures.py", line 181, in result
  File "asyncio\tasks.py", line 249, in __step
  File "vaex\execution.py", line 334, in execute_async
  File "vaex\memory.py", line 37, in create_tracker
ValueError: No memory tracker found with name default
[1272] Failed to execute script 'main' due to unhandled exception!

Software information

  • Vaex version (import vaex; vaex.__version__): {'vaex-core': '4.7.0'}
  • Vaex was installed via: pip
  • Python: 3.7.9
  • Pyinstaller: 4.7
  • OS: Win10
@maartenbreddels
Member

Hi,

thanks for sharing this.
I think pyinstaller is not picking up entry points for some reason.
Those are listed in

'default = vaex.memory:MemoryTracker'

Does this help you?

Regards,

Maarten Breddels
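
For readers unfamiliar with the format: an entry point line such as the one quoted above has the shape name = module:attribute, and loading it amounts to importing the module and fetching the attribute. A minimal stdlib sketch of that idea follows; parse_entry_point and load_entry_point are hypothetical helper names, not part of vaex or pkg_resources.

```python
import importlib


def parse_entry_point(line):
    """Split 'name = module:attr' into (name, module, attr)."""
    name, _, target = line.partition("=")
    module, _, attr = target.partition(":")
    return name.strip(), module.strip(), attr.strip()


def load_entry_point(line):
    """Import the module and return the object the entry point refers to."""
    _, module_name, attr = parse_entry_point(line)
    obj = importlib.import_module(module_name)
    # The attribute part may be dotted (e.g. 'Class.method'); it may also be empty.
    for part in filter(None, attr.split(".")):
        obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    print(parse_entry_point("default = vaex.memory:MemoryTracker"))
    # Demonstrated with a stdlib target so the example runs without vaex.
    print(load_entry_point("dumps = json:dumps")([1, 2, 3]))
```

The import inside load_entry_point is driven by a string read from metadata at runtime, so a static analysis of the import statements cannot see it.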

@styliann-eth

I'm having the same error. What you're mentioning is included in entry_points.txt but I don't know how to solve that.

@schwingkopf
Author

I got it working by following the hints from pyinstaller/pyinstaller#3050 and adding the following to my .spec file:

# Imports needed by the helper below
import os
import pkg_resources

# Helper function to make iter_entry_points work e.g. for vaex
# copied and modified from https://github.com/pyinstaller/pyinstaller/issues/3050
def prepare_entrypoints(ep_packages):
    
    hook_ep_packages = dict()
    hiddenimports = set()
    runtime_hooks = list()
    
    if not ep_packages:
        return list(hiddenimports), runtime_hooks
        
    for ep_package in ep_packages:
        for ep in pkg_resources.iter_entry_points(ep_package):
            if ep_package in hook_ep_packages:
                package_entry_point = hook_ep_packages[ep_package]
            else:
                package_entry_point = []
                hook_ep_packages[ep_package] = package_entry_point
            package_entry_point.append("{} = {}:{}".format(ep.name, ep.module_name, ep.attrs[0]))
            hiddenimports.add(ep.module_name)

    # Create the output folder for the generated runtime hook if it is missing
    os.makedirs('./generated', exist_ok=True)

    with open("./generated/pkg_resources_hook.py", "w") as f:
        f.write("""# Runtime hook generated from spec file to support pkg_resources entrypoints.
ep_packages = {}

if ep_packages:
    import pkg_resources
    default_iter_entry_points = pkg_resources.iter_entry_points

    def hook_iter_entry_points(group, name=None):
        if group in ep_packages and ep_packages[group]:
            eps = ep_packages[group]
            for ep in eps:
                parsedEp = pkg_resources.EntryPoint.parse(ep)
                parsedEp.dist = pkg_resources.Distribution()
                yield parsedEp
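        else:
            # This function is a generator, so the fallback must use yield from;
            # a plain return would produce no entry points for unlisted groups.
            yield from default_iter_entry_points(group, name)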

    pkg_resources.iter_entry_points = hook_iter_entry_points
""".format(hook_ep_packages))
    
    runtime_hooks.append("./generated/pkg_resources_hook.py")
    
    return list(hiddenimports), runtime_hooks

# List of entry point groups (not package names) whose entry points should be included.
ep_packages = ["vaex.memory.tracker"]

hiddenimports, runtime_hooks = prepare_entrypoints(ep_packages)

Then pass hiddenimports and runtime_hooks to Analysis like so:

a = Analysis(
    ...
    hiddenimports=hiddenimports,
    runtime_hooks=runtime_hooks,
)

Hope that helps
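
A note for anyone adapting the generated hook: because hook_iter_entry_points contains yield, Python treats it as a generator function, so the fallback to the original pkg_resources.iter_entry_points has to be delegated with yield from. Returning the iterator from a generator yields nothing. A minimal stdlib illustration of the difference (both function names are hypothetical):

```python
def wrapper_with_return(use_custom):
    """Generator whose fallback branch uses `return`: the fallback items are lost."""
    if use_custom:
        yield "custom"
    else:
        # In a generator, `return value` only ends iteration; nothing is yielded.
        return iter(["fallback"])


def wrapper_with_yield_from(use_custom):
    """Generator whose fallback branch delegates properly."""
    if use_custom:
        yield "custom"
    else:
        yield from iter(["fallback"])


if __name__ == "__main__":
    print(list(wrapper_with_return(False)))
    print(list(wrapper_with_yield_from(False)))
```

In the context of the hook this matters for every entry point group that is not listed in ep_packages, since those lookups go through the fallback branch.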

@rajeebdash

rajeebdash commented May 31, 2022

I am using the auto-py-to-exe GUI and am facing the same issue (Exception in Tkinter callback ...). Can someone help me resolve it using the GUI?

Traceback (most recent call last):
  File "tkinter\__init__.py", line 1702, in __call__
  File "KPI_Automation_GUI.py", line 302, in startConversion
    startConversion_mf4()
  File "KPI_Automation_GUI.py", line 215, in startConversion_mf4
    match_extract_txt.match_and_Extract(textfilelist, str(DriveEnv.get()))
  File "match_extract_txt.py", line 2142, in match_and_Extract
    df.export_hdf5(Databasehdf5FilePath_temp, progress=True, chunk_size=1000000, parallel=True, mode='w')
  File "vaex\dataframe.py", line 6907, in export_hdf5
    with vaex.utils.progressbars(progress, title="export(hdf5)") as progressbar:
  File "vaex\utils.py", line 988, in progressbars
    return tree(*args, **kwargs)
  File "vaex\progress.py", line 206, in tree
    return ProgressTree(bar=bar(title=title), next=next, name=name)
  File "vaex\progress.py", line 181, in bar
    return _progressbar_registry[type_name](title=title)
  File "vaex\utils.py", line 75, in __getitem__
    raise NameError(f'No {self.typename} registered with name {name!r} under entry_point {self.entry_points!r}')
NameError: No progressbar registered with name 'simple' under entry_point 'vaex.progressbar'
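
The traceback above names a different entry point group, 'vaex.progressbar', than the memory tracker one. If the .spec workaround from earlier in this thread is used, that group presumably has to be listed in ep_packages as well (an untested assumption). To find every group a build might need, the groups registered in the build environment can be listed by prefix. A sketch using the stdlib importlib.metadata (Python 3.8+); groups_with_prefix is a hypothetical helper name:

```python
from importlib import metadata


def groups_with_prefix(prefix):
    """Return the sorted names of all entry point groups starting with `prefix`."""
    eps = metadata.entry_points()
    # Python 3.10+ exposes .groups; on 3.8/3.9 the result is a plain dict keyed by group.
    groups = eps.groups if hasattr(eps, "groups") else eps.keys()
    return sorted(g for g in groups if g.startswith(prefix))


if __name__ == "__main__":
    # With vaex installed this should list its groups, e.g. for use as ep_packages.
    print(groups_with_prefix("vaex."))
```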

@leprechaunt33

leprechaunt33 commented Jan 15, 2023

It seems Python 3.10 breaks the fix above for PyInstaller because the new importlib.metadata is being used instead. For now, however, I've fixed this for my own project by editing dataset.py and memory.py to add the code from the generated Python hook and set entry_points to the hook function. utils.py may also need to be overridden in some use cases, but that turned out not to be needed for bare hdf5/csv access.

If any hidden imports are missing after using the entry points fix above, they can be identified by using something like this:

import json
import sys

# Dump the names of all currently imported modules for later diffing.
with open('modules.txt', 'w') as modlist:
    print(json.dumps(sorted(sys.modules.keys()), indent=4), file=modlist)

and doing a diff between the output of the regular Python run and that of the compiled exe, then filtering the results. It's possible that simply changing the import in the hook file would fix the problem, but I haven't tested that yet, as it was unclear whether the direct import of entry_points would override the hook when the module is imported a second time. Either way, combining the fix given above with one of these two approaches will yield a working app. To avoid breaking my development setup, I installed a separate Python 3.10 alongside the development environment just for the build, so it doesn't matter that the installed files are edited.
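
The comparison step could look roughly like this. It assumes the snippet above was run once under plain Python and once inside the exe, and that the two outputs were saved under the hypothetical names modules_dev.txt and modules_exe.txt; both helper names are hypothetical too.

```python
import json


def load_module_list(path):
    """Read a JSON list of module names as written by the dump snippet."""
    with open(path) as f:
        return json.load(f)


def missing_in_frozen(dev_modules, frozen_modules):
    """Modules imported in the normal run but absent from the frozen run."""
    return sorted(set(dev_modules) - set(frozen_modules))


if __name__ == "__main__":
    dev = load_module_list("modules_dev.txt")
    frozen = load_module_list("modules_exe.txt")
    for name in missing_in_frozen(dev, frozen):
        print(name)
```

The result still needs manual filtering, since some modules differ between the two runs for reasons unrelated to missing hidden imports.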

EDIT: For those trying to build apps that rely on vaex-viz: the lazy accessors are set in __init__.py, and applying the same monkey patch for vaex.dataframe.accessor and vaex.expression.accessor in __init__.py will resolve the problem.

@intelligibledata

intelligibledata commented Jun 12, 2023

Based on the discussion above and looking in the related issues I have not been able to find a solution to this problem. I'm running Vaex 4.16 on python 3.10 in conda with the following basic example:

import vaex as vx
from vaex.hdf5.dataset import Hdf5MemoryMapped, AmuseHdf5MemoryMapped, Hdf5MemoryMappedGadget
vx.settings.main.memory_tracker.type = 'default'

vx.dataset.opener_classes = [Hdf5MemoryMapped, AmuseHdf5MemoryMapped, Hdf5MemoryMappedGadget]

df = vx.open(r"c:\20220613.hdf5")
print(df.head)
df.select(df['date'] >= "2022-06-01", mode='and')
print("count: ", df.count(selection=True))
df.select(df['starttime'] >= "2022-06-13 14:00:00", mode='and')
print("count: ", df.count(selection=True))
df.close()

This will give the correct result when running as a script. When running as an executable it gives the correct output for the df.head but the df.count results in the memory tracker issue as mentioned in this thread. I would really appreciate some help solving this as I am currently not able to package my solution as an executable.

@fqking

fqking commented Oct 8, 2023

It's been more than a year. Does anyone know the solution? Thank you in advance.

@gostdi

gostdi commented Oct 10, 2023

pyinstaller --hidden-import vaex.viz --hidden-import vaex.astro.legacy --recursive-copy-metadata vaex fixed the issues with Vaex 4.17 on Python 3.10
