Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move into package (and add setup.py) and build AppImage #94

Closed
wants to merge 12 commits into from

Conversation

frankier
Copy link
Contributor

@frankier frankier commented Feb 8, 2022

Thanks your work on this project. It works great! I thought I would do some work so that people can get it running easier.

This PR moves everything into a Python package and adds a script to build an AppImage. Having things in a Python package will make it easier to potentially make other packages in the future.

There are still some things to do which I will list below, but if you think this is a good direction it would be good to merge as your earliest convenience since there could be a lot of conflicts with the move into a Python package.

Here is the TODO:

  • Reduce the AppImage size (<500mb should be possible -- current size is >1GB)
    • Only package CPU PyTorch
    • See if it's possible to get rid of caffe
  • Build AppImage in GitHub CI
  • Add a "download models" script so that we can publish to pypi and give pipx based installation instructions too.
  • Relativise all paths to the models instead of using chdir

Eventually, if inference is e.g. moved to ONNX, then the training code could be separated into a whole separate package from the inference. For now, I think this structure works well enough.

@frankier frankier changed the title Package and appimage Move into package (setup.py) and build appimage Feb 8, 2022
@frankier frankier changed the title Move into package (setup.py) and build appimage Move into package (and add setup.py) and build AppImage Feb 8, 2022
@frankier
Copy link
Contributor Author

Let me know if/when you'd like to move forward with this. One possibility if you don't like the way things have been reorganized here, is that you could reorganise it into a one or more packages some other way (necessary for setup.py I think) and I can try to rebase the rest of the commits on top of your work.

@lukas-blecher
Copy link
Owner

I like the setup and will merge this PR in the future. I have to test the changes first but I don't have the time at the moment.
Thanks for the contribution.

@lukas-blecher
Copy link
Owner

I've added a download models script and added it to the setup.py. Not sure if this is how it's meant to be used, but it works.
It might be problematic if requests are not installed.

When trying to build the appimage I get an error

[=======================================================|    ] 46700/49352  94%
#15 443.3 FATAL ERROR:gzip uncompress failed with error code -3
#15 443.3 Embedding ELF...
#15 443.3 Not able to open the AppImage for writing, aborting
#15 443.3 /tmp/python-appimage-j6hecx0g/AppDir should be packaged as pix2tex-x86_64.AppImage
#15 443.3
#15 445.3 Traceback (most recent call last):
#15 445.3   File "/usr/local/lib/python3.8/runpy.py", line 194, in _run_module_as_main
#15 445.3     return _run_code(code, main_globals, None,
#15 445.3   File "/usr/local/lib/python3.8/runpy.py", line 87, in _run_code
#15 445.3     exec(code, run_globals)
#15 445.3   File "/usr/local/lib/python3.8/site-packages/python_appimage/__main__.py", line 113, in <module>
#15 445.3     main()
#15 445.3   File "/usr/local/lib/python3.8/site-packages/python_appimage/__main__.py", line 109, in main
#15 445.3     command.execute(*command._unpack_args(args))
#15 445.3   File "/usr/local/lib/python3.8/site-packages/python_appimage/commands/build/app.py", line 297, in execute
#15 445.3     build_appimage(destination=destination)
#15 445.3   File "/usr/local/lib/python3.8/site-packages/python_appimage/appimage/build.py", line 63, in build_appimage
#15 445.3     raise RuntimeError('Could not build AppImage')
#15 445.3 RuntimeError: Could not build AppImage

This might very well be because I messed around in the setup and dockerfile.

@frankier
Copy link
Contributor Author

frankier commented Apr 9, 2022

Hi! Thanks for taking a look at this.

I have pulled a copy with your changes and it works fine for me. The first thing to do is figure out what's going wrong here. This seems to be the original error "FATAL ERROR:gzip uncompress failed with error code -3". A gzip uncompress failed, but I can't find any info except for here AppImage/AppImageKit#398 Are you running out of disk space or RAM? Can you try on another machine?

I am making some other comments but not acting on them myself (yet), since I think we should make sure you are able to build/test yourself first. If you agree I can fix up the below two points:

1) I'm not sure about putting arbitrary code into setup.py. This seems to be heavily discouraged nowadays and is likely to lead to headaches in the future. Like you mentioned, it actually adds an extra layer of "pre-install" dependencies to make sure the setup.py can run before the dependencies are fetched. This will also make it harder to migrate to setup.cfg which is what the PyPA are pushing towards. (i.e. they want to get rid of setup.py altogether) In general Python packages cannot assume they are able to execute arbitrary code at installation time since this will not work for whl packages. What might be possible if we don't want to require explicitly instructing users to run the download script (which in this case could be installed as an entrypoint in setup.py to make it nicer), would be to make the script idempotent and attempt to run it every time either pix2tex_cli or pix2tex_gui is started, so that they will "self-bootstrap". In fact this seems to be already being done in pix2tex_cli, so perhaps we can leave it like that and remove the setup.py helper? The script can be run explicitly in the Dockerfile.

2) The new download script does not play nicely with some of the changes I made. In my original changes I am always accessing the model files using the following helper:

@contextlib.contextmanager
def in_model_path():
    from importlib.resources import path
    with path('pix2tex', 'model') as model_path:
        os.chdir(model_path)
        saved = os.getcwd()
        try:
            yield
        finally:
            os.chdir(saved)

What this does is uses importlib.resources.path to get a path within the pix2tex module, wherever this currently resolves to according to PYTHONPATH. But the downloading script and the check in cli.py are still working relative to the working directory. This means that for example when I run the AppImage it attempts to redownload the models because the check does is not look within the pix2tex/models path within the mounted AppImage.

@lukas-blecher
Copy link
Owner

lukas-blecher commented Apr 10, 2022

First of all thank you for the fast and very detailed answer.
You were right. When I gave Docker more memory the script completed without a problem. Though right now it seems like I introduced some errors when executing the AppImage.
I think I will split this PR into two, one for the package and one for the AppImage.
In my opinion the package is more important and as you mentioned above the AppImage is quite large, so it would make more sense if the model can be converted to ONNX or similar.

@lukas-blecher lukas-blecher added the enhancement New feature or request label Apr 12, 2022
@lukas-blecher lukas-blecher mentioned this pull request Apr 13, 2022
@lukas-blecher
Copy link
Owner

I already merged the package part into main (sorry I did a squash commit without understanding what it does, so you won't show up as contributor yet)
The AppImage part is now in #120 (no squash there)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants